Fix memory exhaustion caused by circular references in PDF xref chains by k00ni · Pull Request #787 · smalot/pdfparser

k00ni · 2025-11-24T08:02:28Z

Type of pull request

Bug fix (involves code and configuration changes)

About

This PR fixes xref handling for circular references. A malformed PDF whose xref chain points back to itself can send PDFParser into an endless loop and exhaust memory. These fixes should prevent that.
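To illustrate the failure mode and the general guard against it, here is a minimal sketch in Python. It is not the library's PHP code: the function name `walk_xref_chain` and the dictionary that maps each xref offset to its "previous" offset are assumptions made up for this example. It only shows the idea of remembering visited offsets so a chain that loops back on itself terminates.

```python
from typing import Dict, List, Optional


def walk_xref_chain(prev_links: Dict[int, Optional[int]], start: int) -> List[int]:
    """Follow 'previous xref' links from `start`, stopping at the first repeat.

    `prev_links` is a hypothetical stand-in for a parsed PDF: it maps the
    byte offset of an xref section to the offset of the previous section
    (or None when there is no previous section).
    """
    visited = set()   # offsets already processed
    order = []        # offsets in the order they were processed
    offset: Optional[int] = start
    while offset is not None and offset not in visited:
        visited.add(offset)
        order.append(offset)
        # Unknown offsets yield None, which also ends the walk.
        offset = prev_links.get(offset)
    return order


# A malformed chain: 100 -> 50 -> 100 -> ... (would loop forever without the guard)
circular = {100: 50, 50: 100}
print(walk_xref_chain(circular, 100))  # [100, 50]
```

Without the `visited` check, the loop above would never end for the circular input, which is the kind of behaviour that leads to memory exhaustion when each step also accumulates data.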

It allows me to work with a bunch of "broken" PDFs, but I might have overlooked some things. It would be great if someone else could have a look. Any feedback is appreciated (maybe @GreyWyvern @j0k3r?) 🚀

#2)

* Initial plan
* Initial setup: Add crash file and reproducer script
  Co-authored-by: k00ni <381727+k00ni@users.noreply.github.com>
* Fix infinite loop caused by circular xref references
  Co-authored-by: k00ni <381727+k00ni@users.noreply.github.com>
* Add test case for circular xref reference fix
  Co-authored-by: k00ni <381727+k00ni@users.noreply.github.com>
* Address review feedback: Add array<int> type hints and remove test
  - Added array<int> type hints to $visitedOffsets parameters in PHPDoc comments
  - Removed testCircularXrefReference test as requested by reviewer
  - Removed reproducer.php file as requested
  - Crash file memory-exhaustion-crash-f5d89649.txt remains permanently in samples/bugs/
  Co-authored-by: k00ni <381727+k00ni@users.noreply.github.com>
* Remove crash file and add comprehensive tests for circular reference fix
  - Removed memory-exhaustion-crash-f5d89649.txt as requested
  - Added helper methods to expose protected functions for testing
  - Added testGetXrefDataPreventsCircularReferences: Tests that visitedOffsets prevents infinite loops
  - Added testDecodeXrefPassesVisitedOffsets: Tests that decodeXref correctly passes visitedOffsets
  - Added testGetXrefDataTracksVisitedOffsets: Tests that offsets are tracked properly
  - All tests use minimal PDF structures and don't rely on external crash file
  Co-authored-by: k00ni <381727+k00ni@users.noreply.github.com>

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: k00ni <381727+k00ni@users.noreply.github.com>
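The commit notes mention a list of visited offsets being passed between the xref functions and tests that check the tracking. A rough Python sketch of that pattern follows. The names `get_xref_data` and `decode_xref` are loose analogues chosen for readability, and the `sections` dictionary is a hypothetical in-memory stand-in for a PDF; none of this reflects the actual PHP signatures or parsing logic in the PR.

```python
from typing import Dict, List, Optional


def get_xref_data(
    sections: Dict[int, dict],
    offset: int,
    xref: Optional[Dict[int, int]] = None,
    visited_offsets: Optional[List[int]] = None,
) -> Dict[int, int]:
    """Collect xref entries starting at `offset` (hypothetical analogue)."""
    xref = {} if xref is None else xref
    visited_offsets = [] if visited_offsets is None else visited_offsets

    # Guard: an offset seen before means the chain is circular, so stop here.
    if offset in visited_offsets:
        return xref
    visited_offsets.append(offset)

    section = sections.get(offset)
    if section is None:
        return xref
    return decode_xref(sections, section, xref, visited_offsets)


def decode_xref(
    sections: Dict[int, dict],
    section: dict,
    xref: Dict[int, int],
    visited_offsets: List[int],
) -> Dict[int, int]:
    """Merge one section's entries, then follow its 'prev' link if present."""
    for obj_id, obj_offset in section["entries"].items():
        # In this sketch, entries from sections visited earlier take precedence.
        xref.setdefault(obj_id, obj_offset)

    prev = section.get("prev")
    if prev is not None:
        # Pass the same list along so the guard sees every visited offset.
        return get_xref_data(sections, prev, xref, visited_offsets)
    return xref
```

A check in the spirit of the tests listed above could build two sections that point at each other, pass in an empty list, and assert both that the call returns and that the list ends up holding each offset exactly once.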

GreyWyvern · 2025-11-24T19:02:32Z

I haven't checked if it solves #71, but the PR code worked fine, without error, on my main search engine implementation (464 PDFs).

Then again, I'm pretty sure none of those PDFs were of the malformed type to have caused this error.

k00ni · 2025-11-25T07:50:10Z

> Then again, I'm pretty sure none of those PDFs were of the malformed type to have caused this error.

You would have noticed 😅

Thank you very much for taking the time.

j0k3r

Looks ok to me

k00ni · 2026-01-08T08:21:30Z

@j0k3r I had to release https://github.com/smalot/pdfparser/releases/tag/v2.12.3 without asking you because it contains a fix for a Denial of Service vulnerability and I was not sure who else can see the release-draft. I hope it was OK.

k00ni self-assigned this Nov 24, 2025

k00ni added the fix and help wanted labels Nov 24, 2025

k00ni mentioned this pull request Nov 24, 2025

Stops with no error indication when low memory limit #71

Open

Removed PHP-CS-Fixer issues

c193c32

k00ni requested a review from j0k3r December 29, 2025 08:01

j0k3r approved these changes Jan 5, 2026


k00ni merged commit 61c9bca into smalot:master Jan 8, 2026
36 checks passed

k00ni deleted the memory-exhaustion-circular-references-xref branch January 8, 2026 08:04

k00ni merged 2 commits into smalot:master from k00ni:memory-exhaustion-circular-references-xref
