Collaborator

@jamesbraza jamesbraza commented Nov 10, 2025

Missed these in #1121


Note

Expose a configurable Docling PDF backend in the reader and update CI to build, test, and publish the paper-qa-docling package.

  • Docling reader (packages/paper-qa-docling/src/paperqa_docling/reader.py):
    • Add backend arg to parse_pdf_to_pages (default: DoclingParseV4DocumentBackend), pass through to PdfFormatOption.
    • Include backend in output ParsedMetadata.name.
    • Minor typing import adjustments.
  • CI:
    • Build/Publish (.github/workflows/build.yml): Build, download, and clean packages/paper-qa-docling artifact alongside existing packages.
    • Lint/Test (.github/workflows/tests.yml): Add build/check and cleanup steps for paper-qa-docling in Python 3.11 matrix.
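
The reader change listed above can be sketched roughly as follows. This is a minimal, self-contained illustration of the pass-through pattern only, not the actual code in reader.py: every `Stub*` class is a hypothetical stand-in (for DoclingParseV4DocumentBackend, PdfFormatOption, and ParsedMetadata), the function name `parse_pdf_to_pages_sketch` is invented for this example, and the exact format of the metadata name is an assumption.

```python
from dataclasses import dataclass


class StubV4Backend:
    """Hypothetical stand-in for DoclingParseV4DocumentBackend."""


class StubAltBackend:
    """Hypothetical stand-in for any alternative PDF backend class."""


@dataclass
class StubPdfFormatOption:
    """Hypothetical stand-in for Docling's PdfFormatOption."""

    backend: type


@dataclass
class StubParsedMetadata:
    """Hypothetical stand-in for ParsedMetadata; only `name` is modeled."""

    name: str


def parse_pdf_to_pages_sketch(
    path: str, backend: type = StubV4Backend
) -> tuple[StubPdfFormatOption, StubParsedMetadata]:
    """Show how a caller-supplied backend flows into options and metadata.

    No PDF is actually parsed here; `path` is accepted only to mirror the
    shape of a reader function.
    """
    # Pass the chosen backend class straight through to the format option.
    option = StubPdfFormatOption(backend=backend)
    # Record the backend in the metadata name so outputs produced with
    # different backends can be told apart (name format is an assumption).
    metadata = StubParsedMetadata(name=f"docling|backend={backend.__name__}")
    return option, metadata


# Default backend versus an explicitly chosen one.
default_option, default_meta = parse_pdf_to_pages_sketch("paper.pdf")
alt_option, alt_meta = parse_pdf_to_pages_sketch("paper.pdf", backend=StubAltBackend)
```

Taking the backend as a class rather than an instance matches the PR's description of a default of DoclingParseV4DocumentBackend that is handed on to PdfFormatOption; putting its name into the metadata presumably lets downstream consumers distinguish results produced by different backends.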

Written by Cursor Bugbot for commit b7031e1.

@jamesbraza jamesbraza self-assigned this Nov 10, 2025
Copilot AI review requested due to automatic review settings November 10, 2025 19:44
@jamesbraza jamesbraza added the bug Something isn't working label Nov 10, 2025
@dosubot dosubot bot added the size:M This PR changes 30-99 lines, ignoring generated files. label Nov 10, 2025
dosubot bot commented Nov 10, 2025

Related Documentation

Checked 1 published document(s) in 1 knowledge base(s). No updates required.

@dosubot dosubot bot added the enhancement New feature or request label Nov 10, 2025
Contributor

Copilot AI left a comment

Pull Request Overview

This PR exposes the backend parameter in the parse_pdf_to_pages function for the paper-qa-docling package and adds the package to CI linting and publishing workflows.

  • Adds backend parameter to allow customization of the PDF parsing backend
  • Includes paper-qa-docling in both test and build CI workflows
  • Updates metadata to include backend information for better tracking

Reviewed Changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated no comments.

| File | Description |
| --- | --- |
| packages/paper-qa-docling/src/paperqa_docling/reader.py | Exposes backend parameter with proper type hints, passes it to PdfFormatOption, and includes it in metadata name |
| .github/workflows/tests.yml | Adds paper-qa-docling package build and inspection to the lint job |
| .github/workflows/build.yml | Adds paper-qa-docling package to the publish workflow for distribution |

@jamesbraza jamesbraza merged commit 607b350 into main Nov 10, 2025
7 checks passed
@jamesbraza jamesbraza deleted the better-docling branch November 10, 2025 23:58