This repository contains the code to enable semantic search on the Voxel51 documentation from Python or the command line. The search is powered by FiftyOne, OpenAI's text-embedding-ada-002 model, and Qdrant vector search.
- 2021-06-14: The fiftyone-docs-search package has been updated in the following ways:
  - FiftyOne Documentation embeddings have been updated to FiftyOne 0.21.0.
  - Splitting of documents is simplified and more robust. LangChain splitters are used in conjunction with our custom Markdown parsing.
  - The block_type argument has been removed to make search results more robust.
- Clone the repository:
git clone https://github.com/voxel51/fiftyone-docs-search
cd fiftyone-docs-search
- Install the package:
pip install -e .
- Register your OpenAI API key (create one):
export OPENAI_API_KEY=XXXXXXXX
- Launch a Qdrant server:
docker pull qdrant/qdrant
docker run -d -p 6333:6333 qdrant/qdrant

The fiftyone-docs-search package provides a command line interface for
searching the Voxel51 documentation. To use it, run:
fiftyone-docs-search query <query>
where <query> is the search query. For example:
fiftyone-docs-search query "how to load a dataset"
The following flags give you control over the search behavior:
- --num_results: the number of results returned
- --open_url: whether to open the top result in your browser
- --score: whether to return the score of each result
- --doc_types: the types of docs to search over (e.g., "tutorials", "api", "guides")
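If you want to drive the command line interface from a script, the command can be assembled as an argument list. The sketch below is illustrative only: `build_query_command` is a hypothetical helper, and it assumes that `--num_results` takes an integer value separated by a space. The value syntax of the other flags is not shown here, so check `fiftyone-docs-search --help` before extending it.

```python
import shlex
from typing import List, Optional


def build_query_command(query: str, num_results: Optional[int] = None) -> List[str]:
    """Assemble the argument list for a docs search (hypothetical helper)."""
    cmd = ["fiftyone-docs-search", "query", query]
    if num_results is not None:
        # Assumption: the flag is followed by its integer value.
        cmd += ["--num_results", str(num_results)]
    return cmd


cmd = build_query_command("how to load a dataset", num_results=5)
# shlex.join quotes the query so the line can be pasted into a shell.
print(shlex.join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` once the package is installed and the Qdrant server is running.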
You can also use the --help flag to see all available options:
fiftyone-docs-search --help
If you find fiftyone-docs-search query cumbersome, you can alias the command by adding the following to your ~/.bashrc or ~/.zshrc file:
alias fosearch='fiftyone-docs-search query'

The fiftyone-docs-search package also provides a Python API for searching the
Voxel51 documentation. To use it, run:
from fiftyone.docs_search import FiftyOneDocsSearch
fods = FiftyOneDocsSearch()
results = fods("how to load a dataset")
You can set defaults for the search behavior by passing arguments to the constructor:
fods = FiftyOneDocsSearch(
    num_results=5,
    open_url=True,
    score=True,
    doc_types=["tutorials", "api", "guides"],
)
For any individual search, you can override these defaults by passing arguments.
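As a rough illustration of per-search overrides, the sketch below wraps any callable searcher and forwards only the options that were explicitly set. `search_with_overrides` is a hypothetical helper, not part of the package, and it assumes the per-call keyword names match the constructor arguments shown above (`num_results`, `open_url`, `score`, `doc_types`).

```python
from typing import Any, Callable


def search_with_overrides(searcher: Callable[..., Any], query: str, **overrides: Any) -> Any:
    """Run one search, forwarding only the overrides that are not None.

    Options left as None are dropped, so the defaults given to the
    searcher's constructor stay in effect for them.
    """
    kwargs = {name: value for name, value in overrides.items() if value is not None}
    return searcher(query, **kwargs)
```

With the `fods` object from the example above, a single search that returns ten results instead of the default five might look like `search_with_overrides(fods, "how to load a dataset", num_results=10)`, which is equivalent to calling `fods("how to load a dataset", num_results=10)` under the same assumption.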
The fiftyone-docs-search package is versioned to match the version of the
Voxel51 FiftyOne documentation that it is searching. For example, the v0.20.1
version of the fiftyone-docs-search package is designed to search the
v0.20.1 version of the Voxel51 FiftyOne documentation.
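A quick way to confirm that the package and the documentation it searches line up is to compare the two version strings. The helper names below are hypothetical, and the sketch assumes the versions are plain strings that may or may not carry a leading "v". How you obtain the strings is left open.

```python
def normalize_version(version: str) -> str:
    """Strip surrounding whitespace and any leading 'v' from a version string."""
    return version.strip().lstrip("vV")


def versions_match(package_version: str, docs_version: str) -> bool:
    """Return True when both strings name the same version."""
    return normalize_version(package_version) == normalize_version(docs_version)


print(versions_match("v0.20.1", "0.20.1"))
```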
By default, if you do not have a Qdrant collection instantiated yet, when you
run a search, the fiftyone-docs-search package will automatically download
a JSON file containing a vector indexing of the latest version of the Voxel51
FiftyOne documentation.
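To see whether your Qdrant server already holds a collection before running a search, you can ask its REST API for the list of collections. This is a minimal sketch using only the standard library. It assumes the server launched above is reachable at http://localhost:6333, and the function names are hypothetical. The name of the collection that fiftyone-docs-search creates is not stated here, so the sketch only lists what exists.

```python
import json
from typing import List
from urllib.request import urlopen


def parse_collection_names(payload: dict) -> List[str]:
    """Extract collection names from the body of a GET /collections response."""
    collections = payload.get("result", {}).get("collections", [])
    return [collection["name"] for collection in collections]


def list_qdrant_collections(base_url: str = "http://localhost:6333", timeout: float = 5.0) -> List[str]:
    """Query a running Qdrant server and return the names of its collections."""
    with urlopen(f"{base_url}/collections", timeout=timeout) as response:
        return parse_collection_names(json.load(response))
```

An empty list from `list_qdrant_collections()` suggests that the first search will trigger the automatic download described above.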
If you would like, you can also build the index yourself from a local copy of the Voxel51 FiftyOne documentation. To do so, first clone the FiftyOne repo if you haven't already:
git clone https://github.com/voxel51/fiftyone
and install FiftyOne, as described in the detailed installation instructions here.
Build a local version of the docs by running:
bash docs/generate_docs.bash
Then, set a FIFTYONE_DIR environment variable to the path to the local
FiftyOne repo. For example, if you cloned the repo to ~/fiftyone, you would
run:
export FIFTYONE_DIR=~/fiftyone
Finally, run the following command to build the index:
fiftyone-docs-search create
If you would like to save the Qdrant index to JSON, you can run:
fiftyone-docs-search save -o <path to JSON file>

Contributions are welcome!
If you've made it this far, we'd greatly appreciate it if you'd take a moment to check out FiftyOne and give us a star!
FiftyOne is an open source library for building high-quality datasets and computer vision models. It's the engine that powers this project.
Thanks for visiting! 😊
If you want to join a fast-growing community of engineers, researchers, and practitioners who love computer vision, join the FiftyOne Slack community! 🚀🚀🚀

