
feat(filter): add batch filtering and confidence calibration #25

Merged
foundatron merged 1 commit into main from issue-7 on Mar 11, 2026
Conversation

@foundatron (Owner)

Closes #7

Changes

tentacle/llm/prompts.py — Add FILTER_BATCH_SYSTEM and FILTER_BATCH_USER prompt templates

  • FILTER_BATCH_SYSTEM: Instructs the LLM to score multiple articles, returning a JSON array of {"index": N, "relevance": 0.XX, "reasoning": "..."} with 1-based indices.
  • FILTER_BATCH_USER: Formats a numbered list of title+abstract pairs ([1] Title: ... / Abstract: ...).
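The two templates described above might be sketched as follows. This is an illustrative guess at their shape, not the exact wording in `tentacle/llm/prompts.py`; the hypothetical `format_batch` helper shows how the numbered title+abstract list could be rendered:

```python
# Hypothetical sketch of the batch-filter prompt templates; the exact
# wording in tentacle/llm/prompts.py may differ.
FILTER_BATCH_SYSTEM = (
    "You are a relevance filter. Score each numbered article for relevance. "
    "Respond with ONLY a JSON array of objects: "
    '[{"index": N, "relevance": 0.XX, "reasoning": "..."}]. '
    "Indices are 1-based and must match the input numbering."
)

FILTER_BATCH_USER = "Score the following articles:\n\n{articles}"


def format_batch(articles: list[tuple[str, str]]) -> str:
    """Render (title, abstract) pairs as the 1-based numbered list
    the system prompt expects."""
    return "\n\n".join(
        f"[{i}] Title: {title}\nAbstract: {abstract}"
        for i, (title, abstract) in enumerate(articles, start=1)
    )
```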

tentacle/llm/filter.py — Add filter_batch() function; filter_article() unchanged

  • filter_batch(client, articles, *, model, threshold, batch_size=10) -> list[tuple[float, str]]: Chunks articles into batches of batch_size, sends each batch in a single LLM call, parses JSON array response.
  • On JSON parse failure for entire batch: fall back to filter_article() individually for every article in that batch.
  • On partial parse failure (valid JSON but missing/malformed/out-of-range entries): use parsed results for successful entries, fall back to filter_article() for missing ones.
  • Returns results in input order.
  • Set max_tokens proportionally (e.g., batch_size * 100).
  • Log a warning on any fallback so degraded batches are visible in production logs.
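The chunking, parsing, and two-tier fallback described above can be sketched as below. The real `filter_batch` takes a `client` and `model`; here the LLM call and the per-article fallback are injected as plain callables (`score_batch`, `score_one`) so the sketch stays self-contained. `_parse_batch` is a hypothetical helper name:

```python
import json
import logging

logger = logging.getLogger(__name__)


def _parse_batch(raw: str, n: int) -> dict[int, tuple[float, str]]:
    """Parse the JSON array response into {0-based index: (score, reasoning)}.

    Malformed or out-of-range entries are skipped (treated as missing).
    Raises ValueError if the payload is not a JSON array at all.
    """
    data = json.loads(raw)  # json.JSONDecodeError is a ValueError subclass
    if not isinstance(data, list):
        raise ValueError("expected a JSON array")
    out: dict[int, tuple[float, str]] = {}
    for entry in data:
        try:
            idx = int(entry["index"]) - 1  # prompts use 1-based indices
            score = float(entry["relevance"])
            reasoning = str(entry["reasoning"])
        except (TypeError, KeyError, ValueError):
            continue
        if 0 <= idx < n:
            out[idx] = (score, reasoning)
    return out


def filter_batch(articles, score_batch, score_one, *, batch_size=10):
    """Chunk articles, score each chunk in one call, and fall back to
    per-article scoring when the batch response is unusable.

    Returns (relevance, reasoning) tuples in input order.
    """
    results: list[tuple[float, str]] = []
    for start in range(0, len(articles), batch_size):
        chunk = articles[start:start + batch_size]
        try:
            parsed = _parse_batch(score_batch(chunk), len(chunk))
        except ValueError:
            logger.warning("batch parse failed; falling back per-article")
            parsed = {}
        for i, article in enumerate(chunk):
            if i in parsed:
                results.append(parsed[i])
            else:
                if parsed:  # partial failure: only some entries missing
                    logger.warning("missing entry %d; falling back", start + i + 1)
                results.append(score_one(article))
    return results
```

An empty `parsed` dict covers the full-failure case (every article falls back), while a partially populated one triggers per-entry fallback only for the gaps, matching the behavior described in the bullets above.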

tentacle/cli.py — Update _run_scan() to use filter_batch()

  • Replace the per-article filter_article() loop with a filter_batch() call over new_articles, then build relevant_articles from the results using the same threshold check.
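The threshold step that builds `relevant_articles` could look like the sketch below. `select_relevant` is an illustrative helper, not a name from the PR; it assumes `filter_batch` returns `(relevance, reasoning)` tuples in input order, as stated above:

```python
def select_relevant(articles, scores, threshold):
    """Keep articles whose relevance meets the threshold.

    `scores` is the list of (relevance, reasoning) tuples that
    filter_batch returns, aligned with `articles` by position.
    """
    return [
        article
        for article, (relevance, _reasoning) in zip(articles, scores)
        if relevance >= threshold
    ]
```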

Review Findings

  • Errors: 0
  • Warnings: 3
  • Nits: 4
  • Assessment: NEEDS CHANGES

The most impactful fix is #4 (clamp relevance scores in the batch path to match filter_article behavior). #2 (token budget) is a latent reliability issue that will cause silent cost waste. #1 is worth a defensive assertion. The code is otherwise well-structured with solid test coverage and good fallback design.

Replace per-article filter_article() loop in cli.py with filter_batch(),
which sends article batches in a single LLM call. Falls back to individual
filter_article() calls on full JSON parse failure or missing/out-of-range
entries. Uses 1-based indices in prompts for LLM reliability.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@foundatron foundatron merged commit 3d5b909 into main Mar 11, 2026
1 check passed
@foundatron foundatron deleted the issue-7 branch March 11, 2026 02:22
