Skip to content

Fix symbol export#2052

Open
vyasr wants to merge 14 commits intorapidsai:mainfrom
vyasr:fix/symbol_export
Open

Fix symbol export#2052
vyasr wants to merge 14 commits intorapidsai:mainfrom
vyasr:fix/symbol_export

Conversation

@vyasr
Copy link
Copy Markdown
Contributor

@vyasr vyasr commented May 1, 2026

This PR adds symbol visibility controls to raft to avoid exporting weak symbols that it shouldn't.

Contributes to https://github.com/rapidsai/build-infra/issues/53

@vyasr vyasr self-assigned this May 1, 2026
@vyasr vyasr added improvement Improves an existing functionality non-breaking Introduces a non-breaking change labels May 1, 2026
@copy-pr-bot
Copy link
Copy Markdown

copy-pr-bot Bot commented May 1, 2026

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@vyasr vyasr force-pushed the fix/symbol_export branch from bc22612 to a4938b7 Compare May 1, 2026 02:20
@rapidsai rapidsai deleted a comment from copy-pr-bot Bot May 1, 2026
@rapidsai rapidsai deleted a comment from copy-pr-bot Bot May 1, 2026
@vyasr
Copy link
Copy Markdown
Contributor Author

vyasr commented May 1, 2026

/ok to test

@copy-pr-bot
Copy link
Copy Markdown

copy-pr-bot Bot commented May 1, 2026

Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually.

Contributors can view more details about this message here.

@vyasr
Copy link
Copy Markdown
Contributor Author

vyasr commented May 1, 2026

/ok to test

3 similar comments
@vyasr
Copy link
Copy Markdown
Contributor Author

vyasr commented May 1, 2026

/ok to test

@vyasr
Copy link
Copy Markdown
Contributor Author

vyasr commented May 1, 2026

/ok to test

@vyasr
Copy link
Copy Markdown
Contributor Author

vyasr commented May 1, 2026

/ok to test

vyasr and others added 7 commits May 1, 2026 11:06
Add CUVS_EXPORT/CUVS_HIDDEN macros and set CXX_VISIBILITY_PRESET hidden
on cuvs_objs and cuvs_c targets. Apply namespace CUVS_EXPORT cuvs to all
installed C++ headers and CUVS_EXPORT to C API function declarations.

Also fix pairwise_distance.cpp to include its own C API header so that
the CUVS_EXPORT annotation on the function declaration is visible to the
compiler when building with hidden default visibility.
The check-c-abi workflow only uses c/include as its header path.
export.hpp was only present in cpp/include, causing CUVS_EXPORT to be
undefined when the ABI checker parsed C headers, breaking all function
signature comparisons.

Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent)

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
Under -fvisibility=hidden, explicit template instantiations in source
files require CUVS_EXPORT to be exported from libcuvs.so. Without this,
test binaries linking against libcuvs.so get undefined symbols (e.g.
cuvs::neighbors::ivf_flat::index<__half, long>::n_lists()).

Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent)

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
NVCC does not support __attribute__((visibility("default"))) on
explicit class template instantiations (error rapidsai#1091-D). These symbols
are already exported via the namespace-level CUVS_EXPORT on the header
declarations. Only function template instantiations (template void)
need the per-instantiation CUVS_EXPORT in .cu files.
This function was the only C API declaration in pq.h missing the
CUVS_EXPORT annotation, causing an undefined symbol error when the
Python wheel tried to load pq.abi3.so against libcuvs.so built with
-fvisibility=hidden.
… Doxygen

- Rename c/include/cuvs/core/export.hpp -> export.h (pure C macros
  should use .h extension; fixes Java jextract build which only copies
  *.h files to its staging directory)
- Update all 23 C API headers to include export.h instead of export.hpp
- In Doxyfile: set MACRO_EXPANSION=YES, EXPAND_ONLY_PREDEF=YES, add
  CUVS_EXPORT= and CUVS_HIDDEN= to PREDEFINED so Doxygen strips the
  visibility attribute before generating XML, preventing breathe from
  choking on CUVS_EXPORT in C function signatures
@vyasr vyasr force-pushed the fix/symbol_export branch from 92cf4b4 to 2845c84 Compare May 1, 2026 18:06
Temporarily point RAFT_FORK and RAFT_PINNED_TAG at the raft symbol
visibility PR branch to verify cuVS works with hidden-default raft.
@vyasr vyasr marked this pull request as ready for review May 1, 2026 18:11
@vyasr vyasr requested review from a team as code owners May 1, 2026 18:11
@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented May 1, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review
📝 Walkthrough

Summary by CodeRabbit

  • Chores
    • Standardized and tightened library symbol visibility and export macros across the codebase.
    • Introduced a centralized export header and applied export annotations to public APIs for more consistent linking.
    • Updated build/docs configuration and copyright years.

Walkthrough

Adds a centralized symbol-visibility system (CUVS_EXPORT / CUVS_HIDDEN), configures CMake to hide symbols by default, and applies export annotations across C headers, C++ headers/namespaces, and selected source template instantiations; updates Doxygen config and minor includes/copyright years.

Changes

Symbol visibility / export annotations

Layer / File(s) Summary
Export Macro Definitions
c/include/cuvs/core/export.h, cpp/include/cuvs/core/export.hpp
New headers define CUVS_EXPORT and CUVS_HIDDEN (GCC-like visibility attributes or empty on other toolchains).
Build System Defaults
c/CMakeLists.txt, cpp/CMakeLists.txt
Targets configured with CXX_VISIBILITY_PRESET hidden and VISIBILITY_INLINES_HIDDEN ON to default-hide C++ symbols and inlines.
C API: header wiring
c/include/cuvs/core/all.h, c/include/.../*.h (many C headers listed in raw summary)
C headers now include cuvs/core/export.h via umbrella or per-file includes to enable CUVS_EXPORT. Copyright year updates in several headers.
C API: function annotations
c/include/cuvs/core/c_api.h, c/include/cuvs/cluster/kmeans.h, c/include/cuvs/distance/pairwise_distance.h, c/include/cuvs/neighbors/*.h, c/include/cuvs/preprocessing/*/*.h
Public C API function declarations across ~20+ headers are prefixed with CUVS_EXPORT (signatures unchanged).
C++ headers: include export and namespace restructure
cpp/include/cuvs/core/*.hpp, cpp/include/cuvs/cluster/*.hpp, cpp/include/cuvs/distance/*.hpp, cpp/include/cuvs/neighbors/*.hpp, cpp/include/cuvs/preprocessing/*.hpp, cpp/include/cuvs/util/*.hpp, cpp/include/cuvs/selection/*.hpp, cpp/include/cuvs/stats/*.hpp
~35 C++ headers now include cuvs/core/export.hpp and convert inline-qualified namespaces (namespace cuvs::...) into explicit nested namespaces with CUVS_EXPORT applied to the outer cuvs scope.
Source: explicit template instantiations marked exported
cpp/src/distance/kde.cu, cpp/src/neighbors/epsilon_neighborhood.cu, cpp/src/neighbors/ivf_flat_index.cpp
Explicit template instantiations are annotated with CUVS_EXPORT (e.g., template CUVS_EXPORT ..., template struct CUVS_EXPORT ...) to export those symbols.
Documentation config
cpp/doxygen/Doxyfile
Doxygen macro expansion enabled; CUVS_EXPORT/CUVS_HIDDEN added to PREDEFINED as empty and export.h/export.hpp excluded from input.
Minor compilation wiring
c/include/cuvs/distance/pairwise_distance.h, c/src/distance/pairwise_distance.cpp
Added missing include of pairwise_distance header and small SPDX year update to fix compilation/unit cohesiveness.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 50.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Title check ✅ Passed The title clearly summarizes the main change: adding symbol export controls to fix symbol visibility issues in the cuvs library.
Description check ✅ Passed The description is related to the changeset, explaining that the PR adds symbol visibility controls to avoid exporting weak symbols, contributing to a build infrastructure issue.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Tip

💬 Introducing Slack Agent: The best way for teams to turn conversations into code.

Slack Agent is built on CodeRabbit's deep understanding of your code, so your team can collaborate across the entire SDLC without losing context.

  • Generate code and open pull requests
  • Plan features and break down work
  • Investigate incidents and troubleshoot customer tickets together
  • Automate recurring tasks and respond to alerts with triggers
  • Summarize progress and report instantly

Built for teams:

  • Shared memory across your entire org—no repeating context
  • Per-thread sandboxes to safely plan and execute work
  • Governance built-in—scoped access, auditability, and budget controls

One agent for your entire SDLC. Right inside Slack.

👉 Get started


Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (8)
cpp/CMakeLists.txt (1)

931-940: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Propagate the visibility defaults to cuvs-cagra-search too.

cuvs-cagra-search is also compiled into the shared library, but it still builds with the compiler defaults. That leaves one object target outside the hidden-default setup, so the new export hygiene can still leak symbols.

♻️ Suggested fix
   set_target_properties(
     cuvs_objs
     PROPERTIES CXX_STANDARD 20
                CXX_STANDARD_REQUIRED ON
                CUDA_STANDARD 20
                CUDA_STANDARD_REQUIRED ON
                POSITION_INDEPENDENT_CODE ON
                CXX_VISIBILITY_PRESET hidden
                VISIBILITY_INLINES_HIDDEN ON
   )
+  set_target_properties(
+    cuvs-cagra-search
+    PROPERTIES CXX_VISIBILITY_PRESET hidden
+               VISIBILITY_INLINES_HIDDEN ON
+  )
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cpp/CMakeLists.txt` around lines 931 - 940, The cuvs-cagra-search target is
not receiving the hidden-visibility defaults, allowing symbol leakage; update
the CMake configuration so cuvs-cagra-search is configured like cuvs_objs (or
included in the same set_target_properties call) to set CXX_STANDARD 20,
CXX_STANDARD_REQUIRED ON, CUDA_STANDARD 20, CUDA_STANDARD_REQUIRED ON,
POSITION_INDEPENDENT_CODE ON, CXX_VISIBILITY_PRESET hidden, and
VISIBILITY_INLINES_HIDDEN ON for the cuvs-cagra-search target (reference the
existing set_target_properties call for cuvs_objs to mirror or extend the same
property list).
c/src/distance/pairwise_distance.cpp (2)

46-53: ⚠️ Potential issue | 🔴 Critical | ⚡ Quick win

Validate the public C API inputs before dereferencing them.

translate_exceptions won't save a null cuvsResources_t or DLManagedTensor*; this path will crash before it can return a cuvsError_t. Please reject null res, x_tensor, y_tensor, and distances_tensor before touching dl_tensor.
Based on coding guidelines, Input validation must check for negative or invalid dimensions, null pointers, and invalid parameter combinations before GPU operations.

🛡️ Suggested fix
 extern "C" cuvsError_t cuvsPairwiseDistance(cuvsResources_t res,
                                             DLManagedTensor* x_tensor,
                                             DLManagedTensor* y_tensor,
                                             DLManagedTensor* distances_tensor,
                                             cuvsDistanceType metric,
                                             float metric_arg)
 {
   return cuvs::core::translate_exceptions([=] {
+    if (res == nullptr || x_tensor == nullptr || y_tensor == nullptr ||
+        distances_tensor == nullptr) {
+      RAFT_FAIL("cuvsPairwiseDistance requires non-null inputs");
+    }
     auto x_dt    = x_tensor->dl_tensor.dtype;
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@c/src/distance/pairwise_distance.cpp` around lines 46 - 53, The public C API
cuvsPairwiseDistance must validate inputs before any dereference: add checks at
the start of cuvsPairwiseDistance (before calling
cuvs::core::translate_exceptions or touching dl_tensor) to reject null pointers
for res, x_tensor, y_tensor, and distances_tensor and return an appropriate
cuvsError_t; also validate tensor dimensions/sizes and metric/metric_arg
combinations (negative dims, mismatched shapes, unsupported cuvsDistanceType)
and return errors early so GPU code never runs with invalid inputs. Ensure these
checks reference the same parameter names (cuvsResources_t res, DLManagedTensor*
x_tensor, DLManagedTensor* y_tensor, DLManagedTensor* distances_tensor,
cuvsDistanceType metric, float metric_arg) and occur prior to any use of
dl_tensor or other members.

54-56: ⚠️ Potential issue | 🔴 Critical | ⚡ Quick win

Read dtype metadata from the matching tensors.

y_dt and dist_dt are both taken from x_tensor right now, so mixed-dtype inputs will pass validation and dispatch the wrong specialization.

🐛 Suggested fix
-    auto x_dt    = x_tensor->dl_tensor.dtype;
-    auto y_dt    = x_tensor->dl_tensor.dtype;
-    auto dist_dt = x_tensor->dl_tensor.dtype;
+    auto x_dt    = x_tensor->dl_tensor.dtype;
+    auto y_dt    = y_tensor->dl_tensor.dtype;
+    auto dist_dt = distances_tensor->dl_tensor.dtype;
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@c/src/distance/pairwise_distance.cpp` around lines 54 - 56, The dtype
variables are incorrectly read from x_tensor for all three; change the
assignments so y_dt reads from y_tensor->dl_tensor.dtype and dist_dt reads from
dist_tensor->dl_tensor.dtype (leave x_dt as x_tensor->dl_tensor.dtype) so that
mixed-dtype inputs validate and dispatch correctly for functions/logic that use
x_dt, y_dt, and dist_dt in pairwise_distance.cpp (variables x_tensor, y_tensor,
dist_tensor).
c/include/cuvs/preprocessing/quantize/scalar.h (1)

88-121: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Add return documentation to the exported quantizer APIs.

cuvsScalarQuantizerTrain, cuvsScalarQuantizerTransform, and cuvsScalarQuantizerInverseTransform still have parameter docs only. As per coding guidelines: public C API functions must include complete Doxygen documentation describing parameters, return values, and side effects.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@c/include/cuvs/preprocessing/quantize/scalar.h` around lines 88 - 121, The
three exported functions cuvsScalarQuantizerTrain, cuvsScalarQuantizerTransform,
and cuvsScalarQuantizerInverseTransform are missing `@return` (and any
side-effect) Doxygen documentation; update each function comment block to
include an `@return` describing the cuvsError_t values returned (e.g.,
CUVS_SUCCESS, error codes) and any side effects (e.g., device/host memory
requirements, in-place modification, ownership of DLManagedTensor pointers), so
the public C API docs are complete and follow the coding guideline.
c/include/cuvs/neighbors/tiered_index.h (1)

214-220: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Add the missing @return section to cuvsTieredIndexSearch.

The newly exported search API still lacks return-value documentation, which leaves the public C API docs incomplete. As per coding guidelines: public C API functions must include complete Doxygen documentation describing parameters, return values, and side effects.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@c/include/cuvs/neighbors/tiered_index.h` around lines 214 - 220, Update the
Doxygen comment for the exported function cuvsTieredIndexSearch to include a
`@return` section that documents the cuvsError_t return values (e.g., CUVS_SUCCESS
on success, specific CUVS_* error codes on failure such as invalid arguments,
allocation or search errors) and any side-effects; locate the comment block
above the cuvsTieredIndexSearch declaration and add a concise `@return`
description listing possible return codes and their meaning so the public C API
docs are complete.
c/include/cuvs/preprocessing/quantize/binary.h (1)

104-139: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Add return documentation to the exported binary quantizer APIs.

cuvsBinaryQuantizerTrain, cuvsBinaryQuantizerTransform, and cuvsBinaryQuantizerTransformWithParams still have parameter-only docs. As per coding guidelines: public C API functions must include complete Doxygen documentation describing parameters, return values, and side effects.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@c/include/cuvs/preprocessing/quantize/binary.h` around lines 104 - 139, The
three exported APIs cuvsBinaryQuantizerTrain, cuvsBinaryQuantizerTransform, and
cuvsBinaryQuantizerTransformWithParams are missing `@return` documentation; update
each Doxygen comment to include an `@return` that describes the returned
cuvsError_t values (e.g., CUVS_SUCCESS on success and possible error codes on
failure such as invalid arguments, allocation failures, or
transform/train-specific errors) and note side effects like modifications to
output tensors (for cuvsBinaryQuantizerTransform and
cuvsBinaryQuantizerTransformWithParams the out DLManagedTensor is written, and
for cuvsBinaryQuantizerTrain the quantizer is created/initialized). Ensure the
`@return` text is clear and consistent across the three functions and references
the same error codes used elsewhere in the API.
c/include/cuvs/neighbors/hnsw.h (1)

580-585: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Fix broken deserialize example identifiers.

The sample uses misspelled/invalid symbols (cuvsHnsWIndexParams_t, malformed cuvsHnswDeserialize call), so copy-paste usage from docs will fail.

Suggested doc snippet correction
- * cuvsHnsWIndexParams_t hnsw_params;
+ * cuvsHnswIndexParams_t hnsw_params;
  * cuvsHnswIndexParamsCreate(&hnsw_params);
  * hnsw_params->hierarchy = NONE;
  * hnsw_index->dtype = index->dtype;
- * cuvsHnswDeserialize(res, hnsw_params, "/path/to/index", dim, metric hnsw_index);
+ * cuvsHnswDeserialize(res, hnsw_params, "/path/to/index", dim, metric, hnsw_index);

As per coding guidelines c/include/cuvs/**/*: “For public C API headers, additionally check: Doxygen documentation for all public functions/classes”.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@c/include/cuvs/neighbors/hnsw.h` around lines 580 - 585, The example doc uses
misspelled/invalid symbols; replace the incorrect type/name and fix the
deserialize call so it matches the public API. Use cuvsHnswIndexParams_t and
cuvsHnswIndexParamsCreate for the params variable (hnsw_params) and call
cuvsHnswDeserialize with the correct argument list and commas (e.g.,
cuvsHnswDeserialize(res, hnsw_params, "/path/to/index", dim, metric,
hnsw_index)) so the example compiles and reflects the real function signature.
c/include/cuvs/neighbors/cagra.h (1)

768-778: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Update the example to match the exported function name.

The docs example calls cuvsCagraSerializeHnswlib, but the public API declaration is cuvsCagraSerializeToHnswlib. This is easy to copy incorrectly and fail at compile time.

Suggested doc fix
- * cuvsCagraSerializeHnswlib(res, "/path/to/index", index);
+ * cuvsCagraSerializeToHnswlib(res, "/path/to/index", index);

As per coding guidelines c/include/cuvs/**/*: “For public C API headers, additionally check: Doxygen documentation for all public functions/classes”.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@c/include/cuvs/neighbors/cagra.h` around lines 768 - 778, The example in the
Doxygen block uses the wrong function name; update the sample call to use the
exported API name cuvsCagraSerializeToHnswlib (e.g., replace
cuvsCagraSerializeHnswlib(res, "/path/to/index", index); with
cuvsCagraSerializeToHnswlib(res, "/path/to/index", index);) so the docs match
the declared function cuvsCagraSerializeToHnswlib.
🧹 Nitpick comments (2)
c/include/cuvs/neighbors/hnsw.h (1)

277-282: ⚡ Quick win

Add dedicated Doxygen for cuvsHnswFromCagraWithDataset.

cuvsHnswFromCagraWithDataset is now explicitly exported but has no standalone parameter/return docs, unlike adjacent public APIs.

Proposed doc block pattern
+/**
+ * `@brief` Convert a CAGRA index to an HNSW index using an explicit dataset tensor.
+ *
+ * `@param`[in] res cuvsResources_t opaque C handle
+ * `@param`[in] params cuvsHnswIndexParams_t used to configure conversion
+ * `@param`[in] cagra_index cuvsCagraIndex_t to convert
+ * `@param`[out] hnsw_index cuvsHnswIndex_t output HNSW index
+ * `@param`[in] dataset_tensor DLManagedTensor* dataset used for conversion
+ * `@return` cuvsError_t
+ */
 CUVS_EXPORT cuvsError_t cuvsHnswFromCagraWithDataset(cuvsResources_t res,
                                          cuvsHnswIndexParams_t params,
                                          cuvsCagraIndex_t cagra_index,
                                          cuvsHnswIndex_t hnsw_index,
                                          DLManagedTensor* dataset_tensor);

As per coding guidelines c/include/cuvs/**/*: “For public C API headers, additionally check: Doxygen documentation for all public functions/classes”.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@c/include/cuvs/neighbors/hnsw.h` around lines 277 - 282, Add a dedicated
Doxygen comment block for the public function cuvsHnswFromCagraWithDataset
documenting the purpose and behaviour, listing each parameter (res, params,
cagra_index, hnsw_index, dataset_tensor) with types/semantics (ownership,
whether pointers may be NULL, expected tensor layout/shape/dtype for
dataset_tensor and whether it is consumed or referenced), describe the return
value cuvsError_t and possible error conditions, and include any thread-safety
or precondition notes similar to the adjacent public APIs' doc style; place this
block immediately above the cuvsHnswFromCagraWithDataset declaration so it
appears as standalone Doxygen for the exported symbol.
c/include/cuvs/neighbors/ivf_pq.h (1)

272-290: ⚡ Quick win

Expand getter docs to full Doxygen format (@param/@return).

The newly exported getter APIs in this block currently have brief comments only; please align them with the fully documented style used elsewhere in this header.

As per coding guidelines c/include/cuvs/**/*: “For public C API headers, additionally check: Doxygen documentation for all public functions/classes”.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@c/include/cuvs/neighbors/ivf_pq.h` around lines 272 - 290, The comments for
the public getter APIs (cuvsIvfPqIndexGetNLists, cuvsIvfPqIndexGetDim,
cuvsIvfPqIndexGetSize, cuvsIvfPqIndexGetPqDim, cuvsIvfPqIndexGetPqBits,
cuvsIvfPqIndexGetPqLen) must be expanded into full Doxygen style: add an `@brief`
line, an `@param` for the input index (describe expected handle) and for the
output pointer (describe what value is returned and that it must be non-NULL),
and an `@return` describing the cuvsError_t return values (e.g., CUVS_SUCCESS on
success, error codes on failure). Update each function’s comment block to follow
the same pattern used in other header docs so the public C API is fully
documented.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@c/include/cuvs/neighbors/ivf_flat.h`:
- Around line 159-163: Add full Doxygen comment blocks for the two new public
getters cuvsIvfFlatIndexGetNLists and cuvsIvfFlatIndexGetDim: replace the brief
inline comments with complete blocks that include `@brief` describing purpose,
`@param` for the cuvsIvfFlatIndex_t index and the output pointer (n_lists or dim)
documenting expected ownership/valid pointer requirements, `@return` documenting
the cuvsError_t return values (e.g., CUVS_SUCCESS and possible error codes) and
any side effects (none), and any notes about thread-safety or valid index
states; ensure the function names cuvsIvfFlatIndexGetNLists and
cuvsIvfFlatIndexGetDim appear in the blocks so the API docs pick them up.

In `@cpp/cmake/thirdparty/get_raft.cmake`:
- Around line 9-10: The code currently hardcodes RAFT_FORK and RAFT_PINNED_TAG
in get_raft.cmake (symbols RAFT_FORK and RAFT_PINNED_TAG); remove these fixed
set(...) calls and instead restore upstream defaults or make them configurable:
introduce CMake cache variables (e.g., RAFT_FORK and RAFT_PINNED_TAG via
option/set with CACHE STRING) with defaults pointing to the official upstream
(rapidsai/main or appropriate tag) and allow overrides from the environment or
CI (e.g., honor RAFT_FORK/RAFT_PINNED_TAG CMake cache entries or an env var like
RAFT_FORK_OVERRIDE); alternatively, delete the test pin here and move any
CI-only override into CI config so normal builds resolve upstream. Ensure
references still use the RAFT_FORK and RAFT_PINNED_TAG symbols but no longer
hardcode the test fork/tag in production CMake.

---

Outside diff comments:
In `@c/include/cuvs/neighbors/cagra.h`:
- Around line 768-778: The example in the Doxygen block uses the wrong function
name; update the sample call to use the exported API name
cuvsCagraSerializeToHnswlib (e.g., replace cuvsCagraSerializeHnswlib(res,
"/path/to/index", index); with cuvsCagraSerializeToHnswlib(res,
"/path/to/index", index);) so the docs match the declared function
cuvsCagraSerializeToHnswlib.

In `@c/include/cuvs/neighbors/hnsw.h`:
- Around line 580-585: The example doc uses misspelled/invalid symbols; replace
the incorrect type/name and fix the deserialize call so it matches the public
API. Use cuvsHnswIndexParams_t and cuvsHnswIndexParamsCreate for the params
variable (hnsw_params) and call cuvsHnswDeserialize with the correct argument
list and commas (e.g., cuvsHnswDeserialize(res, hnsw_params, "/path/to/index",
dim, metric, hnsw_index)) so the example compiles and reflects the real function
signature.

In `@c/include/cuvs/neighbors/tiered_index.h`:
- Around line 214-220: Update the Doxygen comment for the exported function
cuvsTieredIndexSearch to include a `@return` section that documents the
cuvsError_t return values (e.g., CUVS_SUCCESS on success, specific CUVS_* error
codes on failure such as invalid arguments, allocation or search errors) and any
side-effects; locate the comment block above the cuvsTieredIndexSearch
declaration and add a concise `@return` description listing possible return codes
and their meaning so the public C API docs are complete.

In `@c/include/cuvs/preprocessing/quantize/binary.h`:
- Around line 104-139: The three exported APIs cuvsBinaryQuantizerTrain,
cuvsBinaryQuantizerTransform, and cuvsBinaryQuantizerTransformWithParams are
missing `@return` documentation; update each Doxygen comment to include an `@return`
that describes the returned cuvsError_t values (e.g., CUVS_SUCCESS on success
and possible error codes on failure such as invalid arguments, allocation
failures, or transform/train-specific errors) and note side effects like
modifications to output tensors (for cuvsBinaryQuantizerTransform and
cuvsBinaryQuantizerTransformWithParams the out DLManagedTensor is written, and
for cuvsBinaryQuantizerTrain the quantizer is created/initialized). Ensure the
`@return` text is clear and consistent across the three functions and references
the same error codes used elsewhere in the API.

In `@c/include/cuvs/preprocessing/quantize/scalar.h`:
- Around line 88-121: The three exported functions cuvsScalarQuantizerTrain,
cuvsScalarQuantizerTransform, and cuvsScalarQuantizerInverseTransform are
missing `@return` (and any side-effect) Doxygen documentation; update each
function comment block to include an `@return` describing the cuvsError_t values
returned (e.g., CUVS_SUCCESS, error codes) and any side effects (e.g.,
device/host memory requirements, in-place modification, ownership of
DLManagedTensor pointers), so the public C API docs are complete and follow the
coding guideline.

In `@c/src/distance/pairwise_distance.cpp`:
- Around line 46-53: The public C API cuvsPairwiseDistance must validate inputs
before any dereference: add checks at the start of cuvsPairwiseDistance (before
calling cuvs::core::translate_exceptions or touching dl_tensor) to reject null
pointers for res, x_tensor, y_tensor, and distances_tensor and return an
appropriate cuvsError_t; also validate tensor dimensions/sizes and
metric/metric_arg combinations (negative dims, mismatched shapes, unsupported
cuvsDistanceType) and return errors early so GPU code never runs with invalid
inputs. Ensure these checks reference the same parameter names (cuvsResources_t
res, DLManagedTensor* x_tensor, DLManagedTensor* y_tensor, DLManagedTensor*
distances_tensor, cuvsDistanceType metric, float metric_arg) and occur prior to
any use of dl_tensor or other members.
- Around line 54-56: The dtype variables are incorrectly read from x_tensor for
all three; change the assignments so y_dt reads from y_tensor->dl_tensor.dtype
and dist_dt reads from dist_tensor->dl_tensor.dtype (leave x_dt as
x_tensor->dl_tensor.dtype) so that mixed-dtype inputs validate and dispatch
correctly for functions/logic that use x_dt, y_dt, and dist_dt in
pairwise_distance.cpp (variables x_tensor, y_tensor, dist_tensor).

In `@cpp/CMakeLists.txt`:
- Around line 931-940: The cuvs-cagra-search target is not receiving the
hidden-visibility defaults, allowing symbol leakage; update the CMake
configuration so cuvs-cagra-search is configured like cuvs_objs (or included in
the same set_target_properties call) to set CXX_STANDARD 20,
CXX_STANDARD_REQUIRED ON, CUDA_STANDARD 20, CUDA_STANDARD_REQUIRED ON,
POSITION_INDEPENDENT_CODE ON, CXX_VISIBILITY_PRESET hidden, and
VISIBILITY_INLINES_HIDDEN ON for the cuvs-cagra-search target (reference the
existing set_target_properties call for cuvs_objs to mirror or extend the same
property list).

---

Nitpick comments:
In `@c/include/cuvs/neighbors/hnsw.h`:
- Around line 277-282: Add a dedicated Doxygen comment block for the public
function cuvsHnswFromCagraWithDataset documenting the purpose and behaviour,
listing each parameter (res, params, cagra_index, hnsw_index, dataset_tensor)
with types/semantics (ownership, whether pointers may be NULL, expected tensor
layout/shape/dtype for dataset_tensor and whether it is consumed or referenced),
describe the return value cuvsError_t and possible error conditions, and include
any thread-safety or precondition notes similar to the adjacent public APIs' doc
style; place this block immediately above the cuvsHnswFromCagraWithDataset
declaration so it appears as standalone Doxygen for the exported symbol.

In `@c/include/cuvs/neighbors/ivf_pq.h`:
- Around line 272-290: The comments for the public getter APIs
(cuvsIvfPqIndexGetNLists, cuvsIvfPqIndexGetDim, cuvsIvfPqIndexGetSize,
cuvsIvfPqIndexGetPqDim, cuvsIvfPqIndexGetPqBits, cuvsIvfPqIndexGetPqLen) must be
expanded into full Doxygen style: add an `@brief` line, an `@param` for the input
index (describe expected handle) and for the output pointer (describe what value
is returned and that it must be non-NULL), and an `@return` describing the
cuvsError_t return values (e.g., CUVS_SUCCESS on success, error codes on
failure). Update each function’s comment block to follow the same pattern used
in other header docs so the public C API is fully documented.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 5ae306dc-74cc-4afc-b172-aa2efd8b1289

📥 Commits

Reviewing files that changed from the base of the PR and between 7cf69cb and 0d94e7d.

📒 Files selected for processing (69)
  • c/CMakeLists.txt
  • c/include/cuvs/cluster/kmeans.h
  • c/include/cuvs/core/c_api.h
  • c/include/cuvs/core/export.h
  • c/include/cuvs/distance/distance.h
  • c/include/cuvs/distance/pairwise_distance.h
  • c/include/cuvs/neighbors/all_neighbors.h
  • c/include/cuvs/neighbors/brute_force.h
  • c/include/cuvs/neighbors/cagra.h
  • c/include/cuvs/neighbors/common.h
  • c/include/cuvs/neighbors/hnsw.h
  • c/include/cuvs/neighbors/ivf_flat.h
  • c/include/cuvs/neighbors/ivf_pq.h
  • c/include/cuvs/neighbors/mg_cagra.h
  • c/include/cuvs/neighbors/mg_common.h
  • c/include/cuvs/neighbors/mg_ivf_flat.h
  • c/include/cuvs/neighbors/mg_ivf_pq.h
  • c/include/cuvs/neighbors/nn_descent.h
  • c/include/cuvs/neighbors/refine.h
  • c/include/cuvs/neighbors/tiered_index.h
  • c/include/cuvs/neighbors/vamana.h
  • c/include/cuvs/preprocessing/pca.h
  • c/include/cuvs/preprocessing/quantize/binary.h
  • c/include/cuvs/preprocessing/quantize/pq.h
  • c/include/cuvs/preprocessing/quantize/scalar.h
  • c/src/distance/pairwise_distance.cpp
  • cpp/CMakeLists.txt
  • cpp/cmake/thirdparty/get_raft.cmake
  • cpp/doxygen/Doxyfile
  • cpp/include/cuvs/cluster/agglomerative.hpp
  • cpp/include/cuvs/cluster/kmeans.hpp
  • cpp/include/cuvs/cluster/spectral.hpp
  • cpp/include/cuvs/core/bitmap.hpp
  • cpp/include/cuvs/core/bitset.hpp
  • cpp/include/cuvs/core/export.hpp
  • cpp/include/cuvs/distance/distance.hpp
  • cpp/include/cuvs/distance/grammian.hpp
  • cpp/include/cuvs/distance/kde.hpp
  • cpp/include/cuvs/neighbors/all_neighbors.hpp
  • cpp/include/cuvs/neighbors/ball_cover.hpp
  • cpp/include/cuvs/neighbors/brute_force.hpp
  • cpp/include/cuvs/neighbors/cagra.hpp
  • cpp/include/cuvs/neighbors/common.hpp
  • cpp/include/cuvs/neighbors/composite/index.hpp
  • cpp/include/cuvs/neighbors/dynamic_batching.hpp
  • cpp/include/cuvs/neighbors/epsilon_neighborhood.hpp
  • cpp/include/cuvs/neighbors/hnsw.hpp
  • cpp/include/cuvs/neighbors/ivf_flat.hpp
  • cpp/include/cuvs/neighbors/ivf_pq.hpp
  • cpp/include/cuvs/neighbors/knn_merge_parts.hpp
  • cpp/include/cuvs/neighbors/nn_descent.hpp
  • cpp/include/cuvs/neighbors/refine.hpp
  • cpp/include/cuvs/neighbors/scann.hpp
  • cpp/include/cuvs/neighbors/tiered_index.hpp
  • cpp/include/cuvs/neighbors/vamana.hpp
  • cpp/include/cuvs/preprocessing/pca.hpp
  • cpp/include/cuvs/preprocessing/quantize/binary.hpp
  • cpp/include/cuvs/preprocessing/quantize/pq.hpp
  • cpp/include/cuvs/preprocessing/quantize/scalar.hpp
  • cpp/include/cuvs/preprocessing/spectral_embedding.hpp
  • cpp/include/cuvs/selection/select_k.hpp
  • cpp/include/cuvs/stats/silhouette_score.hpp
  • cpp/include/cuvs/stats/trustworthiness_score.hpp
  • cpp/include/cuvs/util/cutlass_utils.hpp
  • cpp/include/cuvs/util/file_io.hpp
  • cpp/include/cuvs/util/host_memory.hpp
  • cpp/src/distance/kde.cu
  • cpp/src/neighbors/epsilon_neighborhood.cu
  • cpp/src/neighbors/ivf_flat_index.cpp

Comment thread c/include/cuvs/neighbors/ivf_flat.h Outdated
Comment thread cpp/cmake/thirdparty/get_raft.cmake Outdated
vyasr added 2 commits May 1, 2026 12:05
The ABI checker scans all.h to verify all C headers are listed.
export.h was missing after the rename from export.hpp.
PREDEFINED alone is insufficient because Doxygen parses the source
#define in export.h/hpp and overrides the PREDEFINED value, expanding
CUVS_EXPORT to __attribute__((visibility("default"))) in the XML
output. Excluding the files prevents the redefinition, so the
PREDEFINED CUVS_EXPORT= (expand to nothing) takes effect.
@vyasr
Copy link
Copy Markdown
Contributor Author

vyasr commented May 3, 2026

CI on b3d67e9 shows a passing run (up to the usual flaky failures) on a run containing the raft changes.

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
cpp/CMakeLists.txt (1)

931-940: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

CUDA_VISIBILITY_PRESET hidden is missing, leaving most compilation uncovered

CXX_VISIBILITY_PRESET only governs the CXX compiler; it does not apply to .cu sources compiled by NVCC. The bulk of cuvs_objs is .cu files, so their host-side symbols are unaffected by the added properties and will remain visible — directly undermining the PR's goal.

The same gap exists for cuvs-cagra-search (lines 333–342): all its sources are .cu files and it has zero visibility settings, yet its objects are merged into the final cuvs shared library alongside cuvs_objs.

CUDA_VISIBILITY_PRESET (available since CMake 3.18, well within this project's 3.30.4 minimum) is the correct property for CUDA sources.

🔧 Proposed fix

For cuvs_objs (lines 931–940), add the CUDA property alongside the CXX one:

 set_target_properties(
   cuvs_objs
   PROPERTIES CXX_STANDARD 20
              CXX_STANDARD_REQUIRED ON
              CUDA_STANDARD 20
              CUDA_STANDARD_REQUIRED ON
              POSITION_INDEPENDENT_CODE ON
              CXX_VISIBILITY_PRESET hidden
+             CUDA_VISIBILITY_PRESET hidden
              VISIBILITY_INLINES_HIDDEN ON
 )

For cuvs-cagra-search (lines 333–342), add the same visibility properties:

 set_target_properties(
   cuvs-cagra-search
   PROPERTIES BUILD_RPATH "\$ORIGIN"
              CXX_STANDARD 20
              CXX_STANDARD_REQUIRED ON
              CUDA_STANDARD 20
              CUDA_STANDARD_REQUIRED ON
              CUDA_SEPARABLE_COMPILATION ON
              POSITION_INDEPENDENT_CODE ON
+             CXX_VISIBILITY_PRESET hidden
+             CUDA_VISIBILITY_PRESET hidden
+             VISIBILITY_INLINES_HIDDEN ON
 )

CMake's <LANG>_VISIBILITY_PRESET determines the value passed in a visibility related compile option, such as -fvisibility= for <LANG> — confirming CXX_VISIBILITY_PRESET and CUDA_VISIBILITY_PRESET are distinct, language-scoped properties.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@cpp/CMakeLists.txt` around lines 931 - 940, The CMake properties set for
cuvs_objs (and likewise for the cuvs-cagra-search target) only set
CXX_VISIBILITY_PRESET, which doesn't affect NVCC-compiled .cu sources; add the
CUDA equivalent by setting CUDA_VISIBILITY_PRESET to hidden (and keep
VISIBILITY_INLINES_HIDDEN ON) for both cuvs_objs and cuvs-cagra-search so
host-side symbols from .cu files are compiled with hidden visibility; update the
target property blocks that reference CXX_VISIBILITY_PRESET to also include
CUDA_VISIBILITY_PRESET hidden.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Outside diff comments:
In `@cpp/CMakeLists.txt`:
- Around line 931-940: The CMake properties set for cuvs_objs (and likewise for
the cuvs-cagra-search target) only set CXX_VISIBILITY_PRESET, which doesn't
affect NVCC-compiled .cu sources; add the CUDA equivalent by setting
CUDA_VISIBILITY_PRESET to hidden (and keep VISIBILITY_INLINES_HIDDEN ON) for
both cuvs_objs and cuvs-cagra-search so host-side symbols from .cu files are
compiled with hidden visibility; update the target property blocks that
reference CXX_VISIBILITY_PRESET to also include CUDA_VISIBILITY_PRESET hidden.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 6c0cfa07-a65e-4d89-860b-15fb758a8ec5

📥 Commits

Reviewing files that changed from the base of the PR and between e11655d and 2f952b3.

📒 Files selected for processing (1)
  • cpp/CMakeLists.txt

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

improvement Improves an existing functionality non-breaking Introduces a non-breaking change

Projects

Status: No status

Development

Successfully merging this pull request may close these issues.

2 participants