Conversation
📝 WalkthroughWalkthroughThe POST /update/wikidata endpoint was refactored to queue an asynchronous Prefect workflow for batch QIDs. A new service trigger starts the Prefect deployment and returns 202; Prefect tasks perform per-QID updates and report individual outcomes. Changes
Sequence DiagramsequenceDiagram
participant Client as Client
participant Flask as Flask Endpoint
participant Service as Import Service
participant Prefect as Prefect Deployment
participant Task as update_wikidata_batch Task
participant Importer as WikidataImporter
Client->>Flask: POST /update/wikidata {qids: [...]}
Flask->>Flask: parse & validate qids
Flask->>Service: trigger_update_wikidata_async(qids)
Service->>Prefect: run_deployment(action: "update/wikidata", qids)
Prefect-->>Service: {deployment_id, id, flow_id, ...}
Service-->>Flask: accepted payload
Flask-->>Client: 202 + Prefect payload
Note over Prefect,Task: async background processing
Prefect->>Task: invoke update_wikidata_batch(qids)
Task->>Task: set env & instantiate Importer
loop for each qid
Task->>Importer: update_entities(qid)
Importer-->>Task: result / error
Task->>Task: record per-QID status
end
Task-->>Prefect: {results: {...}, all_imported: bool}
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Possibly related PRs
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 1 | ❌ 2❌ Failed checks (1 warning, 1 inconclusive)
✅ Passed checks (1 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 2
🧹 Nitpick comments (1)
services/import_service.py (1)
283-310: Docstring says "import" but function performs "update".Line 287 states "Trigger a Prefect Wikidata import workflow" but this function triggers an update workflow. The docstring should be corrected to match the function's purpose.
Additionally, this function is nearly identical to
trigger_wikidata_async(lines 254-281), differing only in theactionparameter and responsemessage. Consider extracting a shared helper to reduce duplication.🔧 Proposed docstring fix
def trigger_update_wikidata_async( qids: list[str], workflow_name: str = DEFAULT_WORKFLOW_NAME, ) -> dict: - """Trigger a Prefect Wikidata import workflow. + """Trigger a Prefect Wikidata update workflow. Args: qids: List of Wikidata QIDs to update.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@services/import_service.py` around lines 283 - 310, The docstring for trigger_update_wikidata_async is incorrect and should say it triggers a Prefect Wikidata update workflow; update the docstring text in trigger_update_wikidata_async to reflect "update" rather than "import", and make the function share logic with trigger_wikidata_async by extracting a helper (e.g., _trigger_wikidata_deployment or run_wikidata_deployment) that accepts parameters like action, qids, workflow_name and returns the same payload structure so both trigger_update_wikidata_async and trigger_wikidata_async call that helper (preserve unique fields like the response message by allowing a message param).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@flask_app/app.py`:
- Around line 212-226: Fix the typos and inconsistent wording in
update_wikidata_async: correct the docstring to remove the stray backtick and
'a' so it reads that the JSON expects a "qids" field (and remove trailing extra
whitespace), and change the error response message from "missing qid" to
"missing qids" so it matches the normalized variable qids and the docstring.
Ensure the docstring clearly states qids can be a list or comma/space-separated
string.
In `@prefect_workflow/prefect_mardi_importer.py`:
- Around line 259-262: The failure check is looking for
result.get("all_imported", True) but update_wikidata_batch returns
"all_updated", so change the flow's check to use the correct key (e.g.,
result.get("all_updated", False)) or standardize to a common key (e.g., "all_ok"
or "success") across batch tasks; update the logic in the flow where result is
inspected (the block handling update_wikidata_batch and the similar checks at
lines referenced) so it reads the matching key returned by update_wikidata_batch
(or adapt update_wikidata_batch to return the standardized key) and ensure the
default value causes the flow to fail when the key is missing/false.
---
Nitpick comments:
In `@services/import_service.py`:
- Around line 283-310: The docstring for trigger_update_wikidata_async is
incorrect and should say it triggers a Prefect Wikidata update workflow; update
the docstring text in trigger_update_wikidata_async to reflect "update" rather
than "import", and make the function share logic with trigger_wikidata_async by
extracting a helper (e.g., _trigger_wikidata_deployment or
run_wikidata_deployment) that accepts parameters like action, qids,
workflow_name and returns the same payload structure so both
trigger_update_wikidata_async and trigger_wikidata_async call that helper
(preserve unique fields like the response message by allowing a message param).
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 6e17f789-863a-4e29-a193-6a40c89a8ca0
📒 Files selected for processing (3)
flask_app/app.pyprefect_workflow/prefect_mardi_importer.pyservices/import_service.py
There was a problem hiding this comment.
♻️ Duplicate comments (1)
flask_app/app.py (1)
212-219:⚠️ Potential issue | 🟡 MinorDocstring is outdated and contains typos.
- Line 213: "Update person profile" is too narrow – this now handles batch QIDs for any Wikidata entity
- Line 215: Double comma ", ," before "which"
- Line 216: Trailing whitespace
- Lines 218-219: Return description mentions "QID that was updated or an empty list" but the endpoint now returns a Prefect trigger payload with HTTP 202
,
✏️ Proposed fix
`@app.post`("/update/wikidata") def update_wikidata_async(): - """Update person profile from wikidata; this is async and happens in Prefect + """Update Wikidata entities asynchronously via Prefect. - Expects JSON with a ``qids`` field, , which may be a list or a string of - comma/space-separated Wikidata QIDs. + Expects JSON with a ``qids`` field, which may be a list or a string of + comma/space-separated Wikidata QIDs. Returns: - Response, either the QID that was updated or an empty list + Flask response tuple with Prefect flow metadata and HTTP 202. """🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@flask_app/app.py` around lines 212 - 219, The docstring for update_wikidata_async is outdated and contains typos; update it to describe that the function handles batch Wikidata QIDs for any entity (not just person), fix the punctuation/whitespace (remove the double comma and trailing space), clarify the input format ("qids" can be a list or a comma/space-separated string of QIDs), and replace the return description with the current behavior stating the endpoint triggers a Prefect flow and returns a Prefect trigger payload with HTTP 202; keep all changes inside the update_wikidata_async docstring.
🧹 Nitpick comments (1)
flask_app/app.py (1)
226-226: Minor: Log message uses singular "QID" for a list.For consistency and clarity, consider using "QIDs" (plural) in the log message.
✏️ Proposed fix
- log.info("QID: %s", qids) + log.info("QIDs: %s", qids)🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@flask_app/app.py` at line 226, The log statement using log.info("QID: %s", qids) is misleading because qids is a list; update the message to use the plural form (e.g., "QIDs") so it accurately reflects the data. Locate the log.info call in flask_app/app.py that references qids and change the message text from "QID: %s" to "QIDs: %s" (preserving the qids variable and formatting) to improve clarity.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Duplicate comments:
In `@flask_app/app.py`:
- Around line 212-219: The docstring for update_wikidata_async is outdated and
contains typos; update it to describe that the function handles batch Wikidata
QIDs for any entity (not just person), fix the punctuation/whitespace (remove
the double comma and trailing space), clarify the input format ("qids" can be a
list or a comma/space-separated string of QIDs), and replace the return
description with the current behavior stating the endpoint triggers a Prefect
flow and returns a Prefect trigger payload with HTTP 202; keep all changes
inside the update_wikidata_async docstring.
---
Nitpick comments:
In `@flask_app/app.py`:
- Line 226: The log statement using log.info("QID: %s", qids) is misleading
because qids is a list; update the message to use the plural form (e.g., "QIDs")
so it accurately reflects the data. Locate the log.info call in flask_app/app.py
that references qids and change the message text from "QID: %s" to "QIDs: %s"
(preserving the qids variable and formatting) to improve clarity.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 11bed487-3786-4694-8a4b-c86e1db839c1
📒 Files selected for processing (2)
flask_app/app.pyprefect_workflow/prefect_mardi_importer.py
Connect wikidata update functionality to prefect
Summary by CodeRabbit