fix(infra): fix some dependency hells and add some lazy loading to reduce celery worker RAM usage #1
base: eval-pr-5478-target-1758731865359
Conversation
5 issues found across 6 files
Prompt for AI agents (all 5 issues)
Understand the root cause of the following 5 issues and fix them.
<file name="backend/onyx/context/search/models.py">
<violation number="1" location="backend/onyx/context/search/models.py:360">
The core logic within `chunks_or_sections_to_search_docs` for converting `InferenceChunk` or `InferenceSection.center_chunk` to `SearchDoc` is a direct duplication of `SearchDoc.from_inference_chunk()`. This method should be refactored to reuse `from_inference_chunk` and `from_inference_section`.</violation>
<violation number="2" location="backend/onyx/context/search/models.py:362">
Quoted type annotation reduces clarity and type tool effectiveness; remove quotes for consistency with the rest of the module.
DEV MODE: This violation would have been filtered out by GPT-5.
Reasoning:
• **GPT-5**: Stylistic only; quoted forward refs are used throughout the module and avoid forward-ref evaluation issues. No functional impact.</violation>
<violation number="3" location="backend/onyx/context/search/models.py:379">
Accessing source_links with [0] can raise KeyError; use .get(0) to safely retrieve the zero-index entry when present.</violation>
<violation number="4" location="backend/onyx/context/search/models.py:431">
Accessing source_links with [0] can raise KeyError; replace with .get(0) to avoid exceptions when key 0 is missing.</violation>
</file>
<file name="backend/onyx/server/query_and_chat/query_backend.py">
<violation number="1" location="backend/onyx/server/query_and_chat/query_backend.py:76">
This PR's goal of reducing memory via lazy-loading is undermined by `onyx/llm/utils.py`, which still eagerly imports heavy libraries like `litellm` and `tiktoken` at the top level. This file's import chain causes these libraries to be loaded on startup, negating the memory savings from other lazy-loading efforts.</violation>
</file>
    is_internet: bool = False

    @classmethod
    def chunks_or_sections_to_search_docs(
The core logic within chunks_or_sections_to_search_docs for converting InferenceChunk or InferenceSection.center_chunk to SearchDoc is a direct duplication of SearchDoc.from_inference_chunk(). This method should be refactored to reuse from_inference_chunk and from_inference_section.
Prompt for AI agents
Address the following comment on backend/onyx/context/search/models.py at line 360:
<comment>The core logic within `chunks_or_sections_to_search_docs` for converting `InferenceChunk` or `InferenceSection.center_chunk` to `SearchDoc` is a direct duplication of `SearchDoc.from_inference_chunk()`. This method should be refactored to reuse `from_inference_chunk` and `from_inference_section`.</comment>
<file context>
@@ -355,6 +356,97 @@ class SearchDoc(BaseModel):
is_internet: bool = False
+ @classmethod
+ def chunks_or_sections_to_search_docs(
+ cls,
+ items: "Sequence[InferenceChunk | InferenceSection] | None",
</file context>
[internal] Confidence score: 9.5/10
[internal] Posted by: Duplicate Detection Agent
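To illustrate the suggested de-duplication, here is a minimal sketch of how the classmethod could delegate to `from_inference_chunk` instead of repeating the field mapping. The models are simplified stand-ins (the real `InferenceChunk`, `InferenceSection`, and `SearchDoc` carry many more fields); only the delegation pattern is the point.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Sequence


@dataclass
class InferenceChunk:  # stand-in: the real model has many more fields
    document_id: str


@dataclass
class InferenceSection:  # stand-in
    center_chunk: InferenceChunk


@dataclass
class SearchDoc:  # stand-in
    document_id: str

    @classmethod
    def from_inference_chunk(cls, chunk: InferenceChunk) -> SearchDoc:
        # Single source of truth for the chunk -> SearchDoc field mapping.
        return cls(document_id=chunk.document_id)

    @classmethod
    def chunks_or_sections_to_search_docs(
        cls,
        items: Sequence[InferenceChunk | InferenceSection] | None,
    ) -> list[SearchDoc]:
        if not items:
            return []
        # Delegate per-item conversion instead of duplicating the mapping.
        return [
            cls.from_inference_chunk(
                item.center_chunk if isinstance(item, InferenceSection) else item
            )
            for item in items
        ]
```

With this shape, any future field added to `SearchDoc` only needs to be mapped once, in `from_inference_chunk`.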
    matching_chunks = document_index.admin_retrieval(query=query, filters=final_filters)

-   documents = chunks_or_sections_to_search_docs(matching_chunks)
+   documents = SearchDoc.chunks_or_sections_to_search_docs(matching_chunks)
This PR's goal of reducing memory via lazy-loading is undermined by onyx/llm/utils.py, which still eagerly imports heavy libraries like litellm and tiktoken at the top level. This file's import chain causes these libraries to be loaded on startup, negating the memory savings from other lazy-loading efforts.
Prompt for AI agents
Address the following comment on backend/onyx/server/query_and_chat/query_backend.py at line 76:
<comment>This PR's goal of reducing memory via lazy-loading is undermined by `onyx/llm/utils.py`, which still eagerly imports heavy libraries like `litellm` and `tiktoken` at the top level. This file's import chain causes these libraries to be loaded on startup, negating the memory savings from other lazy-loading efforts.</comment>
<file context>
@@ -74,7 +73,7 @@ def admin_search(
matching_chunks = document_index.admin_retrieval(query=query, filters=final_filters)
- documents = chunks_or_sections_to_search_docs(matching_chunks)
+ documents = SearchDoc.chunks_or_sections_to_search_docs(matching_chunks)
# Deduplicate documents by id
</file context>
[internal] Confidence score: 10/10
[internal] Posted by: System Design Agent
    chunk_ind=inference_chunk.chunk_id,
    semantic_identifier=inference_chunk.semantic_identifier or "Unknown",
    link=(
        inference_chunk.source_links[0]
Accessing source_links with [0] can raise KeyError; replace with .get(0) to avoid exceptions when key 0 is missing.
Prompt for AI agents
Address the following comment on backend/onyx/context/search/models.py at line 431:
<comment>Accessing source_links with [0] can raise KeyError; replace with .get(0) to avoid exceptions when key 0 is missing.</comment>
<file context>
@@ -355,6 +356,97 @@ class SearchDoc(BaseModel):
+ chunk_ind=inference_chunk.chunk_id,
+ semantic_identifier=inference_chunk.semantic_identifier or "Unknown",
+ link=(
+ inference_chunk.source_links[0]
+ if inference_chunk.source_links
+ else None
</file context>
[internal] Confidence score: 9/10
[internal] Posted by: General AI Review Agent
    ).document_id,
    chunk_ind=chunk.chunk_id,
    semantic_identifier=chunk.semantic_identifier or "Unknown",
    link=chunk.source_links[0] if chunk.source_links else None,
Accessing source_links with [0] can raise KeyError; use .get(0) to safely retrieve the zero-index entry when present.
Prompt for AI agents
Address the following comment on backend/onyx/context/search/models.py at line 379:
<comment>Accessing source_links with [0] can raise KeyError; use .get(0) to safely retrieve the zero-index entry when present.</comment>
<file context>
@@ -355,6 +356,97 @@ class SearchDoc(BaseModel):
+ ).document_id,
+ chunk_ind=chunk.chunk_id,
+ semantic_identifier=chunk.semantic_identifier or "Unknown",
+ link=chunk.source_links[0] if chunk.source_links else None,
+ blurb=chunk.blurb,
+ source_type=chunk.source_type,
</file context>
[internal] Confidence score: 9/10
[internal] Posted by: General AI Review Agent
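For context on why `[0]` can raise here: the review implies `source_links` is a dict keyed by integer offset rather than a list, so a truthiness check alone does not guarantee key `0` exists. A small illustration (the field shape and URL are hypothetical):

```python
# Hypothetical illustration: source_links maps character offset -> URL,
# so [0] is a dict lookup that raises KeyError when no link starts at
# offset 0, even if the dict is non-empty.
source_links: dict[int, str] = {12: "https://example.com/doc#sec2"}

link = source_links.get(0)  # None instead of KeyError
# If "any first link" is acceptable, take the first value regardless of key:
first_link = next(iter(source_links.values()), None)
```

Note that `source_links[0] if source_links else None` only guards against an empty or missing dict, not a missing `0` key, which is exactly the case the reviewer flags.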
    @classmethod
    def chunks_or_sections_to_search_docs(
        cls,
        items: "Sequence[InferenceChunk | InferenceSection] | None",
Quoted type annotation reduces clarity and type tool effectiveness; remove quotes for consistency with the rest of the module.
DEV MODE: This violation would have been filtered out by GPT-5.
Reasoning:
• GPT-5: Stylistic only; quoted forward refs are used throughout the module and avoid forward-ref evaluation issues. No functional impact.
Prompt for AI agents
Address the following comment on backend/onyx/context/search/models.py at line 362:
<comment>Quoted type annotation reduces clarity and type tool effectiveness; remove quotes for consistency with the rest of the module.
DEV MODE: This violation would have been filtered out by GPT-5.
Reasoning:
• **GPT-5**: Stylistic only; quoted forward refs are used throughout the module and avoid forward-ref evaluation issues. No functional impact.</comment>
<file context>
@@ -355,6 +356,97 @@ class SearchDoc(BaseModel):
+ @classmethod
+ def chunks_or_sections_to_search_docs(
+ cls,
+ items: "Sequence[InferenceChunk | InferenceSection] | None",
+ ) -> list["SearchDoc"]:
+ """Convert a sequence of InferenceChunk or InferenceSection objects to SearchDoc objects."""
</file context>
[internal] Confidence score: 8/10
[internal] Posted by: General AI Review Agent
https://linear.app/danswer/issue/DAN-2573/move-more-imports-onto-lazy-loading
So I have a script that I run to check memory usage for celery workers:
before this PR it's ~600 MB per worker,
after it's ~250 MB for some workers,
and the docker container goes from 4.3 GB -> 2.819 GiB.
Diagnosing why is not easy: it's not that all pip dependencies get loaded into memory at worker start (so that I could just lazy-load any of them), it's specifically the ones that get imported at runtime due to an actual import statement.
This makes it very tricky to track down exactly what causes the 600 MB. I literally had to trial-and-error suspicious imports, tracing from the celery worker main file.
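The measurement script itself isn't part of the PR; a minimal stdlib-only stand-in for checking a process's memory before and after an import might look like this (the approach is illustrative, and `resource` is Unix-only):

```python
import resource
import sys


def peak_rss_mb() -> float:
    # Peak resident set size of the current process. ru_maxrss is
    # reported in kilobytes on Linux but in bytes on macOS, so normalize.
    rss = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    if sys.platform == "darwin":
        rss //= 1024
    return rss / 1024.0


baseline = peak_rss_mb()
import json  # stand-in for a heavy runtime import such as litellm
print(f"baseline: {baseline:.1f} MB, after import: {peak_rss_mb():.1f} MB")
```

Running this around a genuinely heavy import (litellm, tiktoken, langchain) makes each offender's contribution visible one import at a time.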
TBH the existing repo dependency graph is a little scuffed. Here is one example branch that caused the worker to import LLM stuff (there are like a dozen of these I had to sift through to get all the memory offenders down):
app base
-> /Users/edwinluo/onyx/backend/onyx/background/celery/tasks/docprocessing/utils.py
-> redis connector
-> /Users/edwinluo/onyx/backend/onyx/redis/redis_connector_delete.py
-> /Users/edwinluo/onyx/backend/onyx/db/document.py
-> /Users/edwinluo/onyx/backend/onyx/db/feedback.py
-> /Users/edwinluo/onyx/backend/onyx/db/chat.py
-> /Users/edwinluo/onyx/backend/onyx/context/search/utils.py
-> /Users/edwinluo/onyx/backend/onyx/db/search_settings.py
-> /Users/edwinluo/onyx/backend/onyx/db/llm.py OR /Users/edwinluo/onyx/backend/onyx/natural_language_processing/search_nlp_models.py
-> /Users/edwinluo/onyx/backend/onyx/llm/utils.py (langchain, litellm, etc.)
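The fix for a chain like the one above is to break the last link: move the heavy import from module top level into the function that actually needs it. A minimal sketch of the pattern, using `difflib` as a lightweight stand-in for a heavy dependency (function and names are illustrative, not from the PR):

```python
def summarize(a: str, b: str) -> float:
    # Deferred import: difflib stands in for a heavy dependency such as
    # litellm or tiktoken. The module is loaded on the first call (and
    # cached in sys.modules afterwards), so a celery worker that never
    # hits this code path never pays the import's memory cost at startup.
    import difflib

    return difflib.SequenceMatcher(None, a, b).ratio()
```

The trade-off is a small latency hit on the first call and a less obvious dependency graph, which is why it's worth reserving the pattern for imports that measurably matter.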
How Has This Been Tested?
[Describe the tests you ran to verify your changes]
Backporting (check the box to trigger backport action)
Note: You have to check that the action passes, otherwise resolve the conflicts manually and tag the patches.
Summary by cubic
Moves heavy imports to lazy-loading across indexing, LLM, NLP, and file-processing code to reduce worker memory and speed up startup. Also consolidates search doc conversion into SearchDoc and extracts PromptSnapshot to a shared schema. Addresses Linear DAN-2573 (Reduce Memory usage in Onyx).
Refactors
Migration