Add enVector #1
… hnsw and ivfflat.
Signed-off-by: cutecutecat <[email protected]>
Signed-off-by: Navneet Verma <[email protected]>
…oud results; Signed-off-by: min.tian <[email protected]>
Signed-off-by: min.tian <[email protected]>
Signed-off-by: min.tian <[email protected]>
Signed-off-by: min.tian <[email protected]>
Added instruction to include custom_case while adding the CLI support for the client.
Signed-off-by: min.tian <[email protected]>
Signed-off-by: min.tian <[email protected]>
Signed-off-by: min.tian <[email protected]>
* Add pgdiskann client
* Add CLI support in pgdiskann
* Add pgdiskann load config in frontend.

Co-authored-by: Sheharyar Ahmad <[email protected]>
Signed-off-by: min.tian <[email protected]>
* Added binary quantization support in pgvector hnsw
* Parameterized search SQL queries. Added distance operator used for reranking, and quantized vector fetch limit in CLI.
* Removed debug logs
* Updated pgvectorhnsw command option name.
* Binary quantization option added in frontend for pgvectorhnsw
* Removed redundant code
* Refactored code
* Removed hamming and jaccard distance options for full vectors. Moved reranking_metric to hnsw config class.
* Refactored code, removed duplicate code.
* Reverted code changes for float input type.
Signed-off-by: min.tian <[email protected]>
Signed-off-by: min.tian <[email protected]>
… because ef_search is not a param of ivfflat.
Signed-off-by: yangxuan <[email protected]>
…est first Signed-off-by: min.tian <[email protected]>
Add VCT centroids for ANN
Pull request overview
This PR adds EnVector as a new vector database client to VectorDBBench, enabling benchmarking of EnVector's FLAT and IVF_FLAT index types with ANN (Approximate Nearest Neighbor) capabilities including VCT (Virtual Cluster Tree) support.
Key changes:
- Integration of EnVector client with support for FLAT and IVF_FLAT index types
- Addition of scripts and configuration files for running EnVector benchmarks with custom datasets
- Support for VCT-based indexing with pre-trained centroids
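As a rough illustration of how FLAT and IVF_FLAT search differ when centroids are supplied up front, here is a minimal pure-Python sketch. It is not EnVector's implementation and does not model VCT; the function names (`build_ivf`, `search_ivf`, `search_flat`), the `nprobe` parameter, and the toy data are all assumptions made for this example.

```python
import math
from collections import defaultdict


def l2(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def build_ivf(vectors, centroids):
    """Assign each vector id to the inverted list of its nearest centroid.

    The centroids are assumed to be pre-trained elsewhere (hypothetical input).
    """
    lists = defaultdict(list)
    for vid, vec in enumerate(vectors):
        nearest = min(range(len(centroids)), key=lambda c: l2(vec, centroids[c]))
        lists[nearest].append(vid)
    return lists


def search_flat(query, vectors, k=1):
    """FLAT search: exhaustively rank every stored vector."""
    return sorted(range(len(vectors)), key=lambda vid: l2(query, vectors[vid]))[:k]


def search_ivf(query, vectors, centroids, lists, k=1, nprobe=1):
    """IVF_FLAT search: scan only the nprobe clusters closest to the query."""
    order = sorted(range(len(centroids)), key=lambda c: l2(query, centroids[c]))
    candidates = [vid for c in order[:nprobe] for vid in lists.get(c, [])]
    candidates.sort(key=lambda vid: l2(query, vectors[vid]))
    return candidates[:k]


# Toy data: two well-separated clusters and one centroid per cluster.
vectors = [[0.0, 0.0], [0.1, 0.2], [5.0, 5.0], [5.2, 4.9]]
centroids = [[0.0, 0.1], [5.1, 5.0]]
lists = build_ivf(vectors, centroids)

query = [5.1, 5.1]
flat_hits = search_flat(query, vectors, k=2)
ivf_hits = search_ivf(query, vectors, centroids, lists, k=2, nprobe=1)
```

On this toy data both searches return the same ids, but the IVF variant only scores the vectors in one cluster. With real data, a small `nprobe` trades recall for speed, which is the kind of trade-off a benchmark of these index types would measure.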
Reviewed changes
Copilot reviewed 14 out of 14 changed files in this pull request and generated 7 comments.
| File | Description |
|---|---|
| vectordb_bench/log_util.py | Disables TQDM progress bars via environment variable |
| vectordb_bench/cli/vectordbbench.py | Registers EnVector CLI commands |
| vectordb_bench/backend/clients/envector/envector.py | Core EnVector client implementation with insert/search operations |
| vectordb_bench/backend/clients/envector/config.py | Configuration classes for EnVector FLAT and IVF_FLAT indexes |
| vectordb_bench/backend/clients/envector/cli.py | CLI interface for EnVector benchmarks |
| vectordb_bench/backend/clients/__init__.py | Registers EnVector in the DB enum and factory methods |
| scripts/run_benchmark.sh | Shell script for running EnVector benchmarks |
| scripts/prepare_dataset.py | Dataset preparation utility for downloading and processing benchmark data |
| scripts/envector_pubmed_config.yml | YAML configuration for PUBMED dataset benchmarks |
| scripts/envector_bloomberg_config.yml | YAML configuration for Bloomberg dataset benchmarks |
| pyproject.toml | Adds required dependencies for EnVector integration |
| README_ENVECTOR.md | Documentation for using EnVector with VectorDBBench |
| README.md | References EnVector documentation |
| .env.example | Updates default environment variable values |
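The table mentions registering a client in a DB enum with factory methods. A minimal sketch of that general pattern is shown below. It is not the code from this PR: `ExampleClient`, `ExampleConfig`, the `uri` default, and the property names `init_cls` and `config_cls` are placeholders assumed for illustration.

```python
from dataclasses import dataclass
from enum import Enum


@dataclass
class ExampleConfig:
    """Hypothetical connection settings; not EnVector's real config class."""
    uri: str = "localhost:50050"  # placeholder address, an assumption


class ExampleClient:
    """Hypothetical client stub standing in for a real implementation."""

    def __init__(self, config: ExampleConfig):
        self.config = config


class DB(Enum):
    """Each member names one backend; properties act as factory lookups."""

    Example = "Example"

    @property
    def init_cls(self):
        # A real registry might import the client module lazily here so that
        # optional dependencies are only required for the selected backend.
        if self is DB.Example:
            return ExampleClient
        raise ValueError(f"Unknown DB: {self.name}")

    @property
    def config_cls(self):
        if self is DB.Example:
            return ExampleConfig
        raise ValueError(f"Unknown DB: {self.name}")


# Resolve a backend by name, then build its config and client.
db = DB("Example")
client = db.init_cls(db.config_cls())
```

Keeping the lookup on the enum means a CLI or config file only needs to carry the backend's name, and adding a new backend touches one enum member plus its two branches.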
Co-authored-by: Copilot <[email protected]>
Pull request overview
Copilot reviewed 14 out of 14 changed files in this pull request and generated 5 comments.