From e2ca694916494e4ec768f4bb1d8e846b76ae137d Mon Sep 17 00:00:00 2001
From: Timofei Sablin
Date: Thu, 6 Nov 2025 17:27:32 +0000
Subject: [PATCH] fix: elaborate on configuring search index

Follow-up on: https://gramps.discourse.group/t/ai-chat-with-openai-returns-an-error-on-successfull-call/8735/4

It might be better to call this "configure" rather than "enable", because
that is what actually happens; the index rebuild then has to be triggered
in the web UI.

---
 docs/install_setup/chat.md | 14 ++++++++------
 1 file changed, 8 insertions(+), 6 deletions(-)

diff --git a/docs/install_setup/chat.md b/docs/install_setup/chat.md
index a1886c46..f4100af2 100644
--- a/docs/install_setup/chat.md
+++ b/docs/install_setup/chat.md
@@ -15,7 +15,7 @@ In the final step, the question and the retrieved objects are sent to an LLM to
 To enable the chat endpoint in Gramps Web API, three steps are necessary:
 
 1. Installing required dependencies,
-2. Enabling semantic search,
+2. Configuring semantic search and generating the semantic search index,
 3. Setting up an LLM provider.
 
-The three step are described below in turn. Finally, an owner or administrator must [configure which users can access the chat feature](users.md#configuring-who-can-use-ai-chat) in the Manage Users settings.
+The three steps are described below in turn. Finally, an owner or administrator must [configure which users can access the chat feature](users.md#configuring-who-can-use-ai-chat) in the Manage Users settings.
@@ -33,9 +33,9 @@
 
 pip install gramps_webapi[ai]
 ```
 
-## Enabling semantic search
+## Configuring semantic search
 
-If the necessary dependencies are installed, enabling semantic search can be as simple as setting the `VECTOR_EMBEDDING_MODEL` configuration option (e.g. by setting the `GRAMPSWEB_VECTOR_EMBEDDING_MODEL` environment variable), see [Server Configuration](configuration.md). This can be any string of a model supported by the [Sentence Transformers](https://sbert.net/) library. See the documentation of this project for details and the available models.
+If the necessary dependencies are installed, configuring semantic search can be as simple as setting the `VECTOR_EMBEDDING_MODEL` configuration option (e.g. by setting the `GRAMPSWEB_VECTOR_EMBEDDING_MODEL` environment variable); see [Server Configuration](configuration.md). This can be the name of any model supported by the [Sentence Transformers](https://sbert.net/) library. See that project's documentation for details and the available models.
 
 !!! warning
@@ -50,10 +50,12 @@ There are several considerations to make when choosing a model.
 
 If the model is not present in the local cache, it will be downloaded when Gramps Web API is started for the first time with the new configuration. The model `sentence-transformers/distiluse-base-multilingual-cased-v2` is already available locally when using the standard docker images. This model is a good starting point and supports multilingual input.
 
+After semantic search is configured, it is necessary to regenerate the semantic search index in the Administration section.
+
 Please share learnings about different models with the community!
 
 !!! info
-    The sentence transformers library consumes a significant amount of memory, which might cause worker processes being killed. As a rule of thumb, with semantic search enabled, each Gunicorn worker consumes around 200 MB of memory and each celery worker around 500 MB of memory even when idle, and up to 1 GB when computing embeddings. See [Limit CPU and memory usage](cpu-limited.md) for settings that limit memory usage. In addition, it is advisable to provision a sufficiently large swap partition to prevent OOM errors due to transient memory usage spikes.
+    The sentence transformers library consumes a significant amount of memory, which might cause worker processes to be killed. As a rule of thumb, with semantic search configured, each Gunicorn worker consumes around 200 MB of memory and each celery worker around 500 MB of memory even when idle, and up to 1 GB when computing embeddings. See [Limit CPU and memory usage](cpu-limited.md) for settings that limit memory usage. In addition, it is advisable to provision a sufficiently large swap partition to prevent OOM errors due to transient memory usage spikes.
 
 ## Setting up an LLM provider
@@ -99,4 +101,4 @@ To troubleshoot problems with Ollama, you can enable debug logging by setting en
 
 ### Using other providers
 
-Please feel free to submit documentation for other providers and share your experience with the community!
\ No newline at end of file
+Please feel free to submit documentation for other providers and share your experience with the community!
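As a sketch of the configuration step this patch documents: the `GRAMPSWEB_VECTOR_EMBEDDING_MODEL` variable and the model string come from the patched docs themselves; the surrounding deployment details (how you restart the server) are assumptions and will vary per installation.

```shell
# Configure the embedding model for semantic search via environment
# variable (equivalent to setting VECTOR_EMBEDDING_MODEL in the config),
# then restart Gramps Web API so the setting takes effect.
export GRAMPSWEB_VECTOR_EMBEDDING_MODEL="sentence-transformers/distiluse-base-multilingual-cased-v2"
```

After the restart, the semantic search index still has to be (re)generated from the Administration section of the web UI, as the patch notes.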