@steel-bucket
Contributor

In raising this pull request, I confirm the following (please check boxes):

  • I have read and understood the contributors guide.
  • I have checked that another pull request for this purpose does not exist.
  • I have considered, and confirmed that this submission will be valuable to others.
  • I accept that this submission may not be used, and the pull request closed at the will of the maintainer.
  • I give this submission freely, and claim no ownership to its content.
  • I have mentioned this change in the changelog.

My familiarity with the project is as follows (check one):

  • I have never used CCExtractor.
  • I have used CCExtractor just a couple of times.
  • I absolutely love CCExtractor, but have not contributed previously.
  • I am an active contributor to CCExtractor.

This PR resolves one TODO and one FIXME in avc_functions.c.
The TODO was: "TODO: Do something if newsize == -1 (broken NAL)". My change prints a more verbose error message in that case.
The FIXME was to check the set_fts function for CCX_NAL_TYPE_SEI, which until now was only being checked in slice_header for NAL types CCX_NAL_TYPE_CODED_SLICE_IDR_PICTURE or CCX_NAL_TYPE_CODED_SLICE_NON_IDR_PICTURE_1.

There are more TODOs in the AVC functions library that need to be resolved before or during the port to Rust.
This PR will help in debugging and resolving the AVC issues currently open in CCExtractor (#1626, #1597, #1592).
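
For readers who do not have avc_functions.c open, the control flow described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code in this PR: handle_nal and update_timing are hypothetical names (update_timing stands in for a timestamp update such as set_fts), and a negative payload_len models the newsize == -1 case. Only the numeric nal_unit_type values come from the H.264 specification.

```c
#include <stdio.h>

/* nal_unit_type values as defined by the H.264 specification. */
enum {
    NAL_NON_IDR_SLICE = 1,
    NAL_IDR_SLICE = 5,
    NAL_SEI = 6
};

/* Hypothetical stand-in for a timestamp update such as set_fts.
 * It only counts calls so the flow is easy to observe. */
static int timing_updates = 0;
static void update_timing(void) { timing_updates++; }

/* Hypothetical dispatcher. Returns 0 if the NAL unit was handled,
 * -1 if it was skipped as broken. A negative payload_len models a
 * failed emulation prevention byte removal (newsize == -1). */
static int handle_nal(unsigned nal_type, long payload_len)
{
    if (payload_len < 0) {
        fprintf(stderr,
                "Warning: NAL unit type %u (0x%02X) could not be unescaped. "
                "NAL unit skipped.\n", nal_type, nal_type);
        return -1;
    }

    switch (nal_type) {
    case NAL_SEI:
        /* The FIXME: also update timing for SEI units. */
        update_timing();
        break;
    case NAL_IDR_SLICE:
    case NAL_NON_IDR_SLICE:
        /* Slices already had their timing updated. */
        update_timing();
        break;
    default:
        break;
    }
    return 0;
}
```

The point of the sketch is that a broken NAL unit is reported and skipped before any timing state is touched, while SEI units take the same timing path as slices.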

@prateekmedia prateekmedia requested a review from Copilot July 15, 2025 19:53

Copilot AI left a comment

Pull Request Overview

This PR addresses two specific items in the AVC functions library: a TODO comment about handling broken NAL units and a FIXME comment about calling set_fts for SEI NAL units. The changes improve error handling for corrupted AVC/H.264 streams and ensure proper timestamp handling for SEI units.

  • Enhanced error messaging for corrupted NAL units with detailed explanations
  • Enabled set_fts function call for SEI NAL units to fix timestamp handling
  • Updated changelog to document the fixes

Reviewed Changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| src/lib_ccx/avc_functions.c | Improved error message for corrupted NAL units and enabled set_fts call for SEI units |
| docs/CHANGES.TXT | Added changelog entry documenting the TODO and FIXME fixes |

Comment on lines +100 to +103
mprint("\rWarning: Invalid prevention bytes detected in NAL unit type %u (0x%02X). "
"This NAL unit contains an illegal byte sequence (0x000000, 0x000001, or 0x000002) or "
"improper prevention byte (0x03). "
"This may indicate a corrupted AVC/H.264 stream. NAL unit skipped.\n",

Copilot AI Jul 15, 2025

The error message incorrectly describes the failure condition. The remove_03emu function can fail for reasons other than invalid prevention bytes, such as general parsing errors or memory issues. Consider using a more generic message like 'Failed to process NAL unit type %u (0x%02X) due to emulation prevention byte removal failure.'

Suggested change
- mprint("\rWarning: Invalid prevention bytes detected in NAL unit type %u (0x%02X). "
- "This NAL unit contains an illegal byte sequence (0x000000, 0x000001, or 0x000002) or "
- "improper prevention byte (0x03). "
- "This may indicate a corrupted AVC/H.264 stream. NAL unit skipped.\n",
+ mprint("\rWarning: Failed to process NAL unit type %u (0x%02X) due to emulation prevention byte removal failure. "
+ "This may indicate a corrupted AVC/H.264 stream or an internal processing error. NAL unit skipped.\n",

Comment on lines +101 to +102
"This NAL unit contains an illegal byte sequence (0x000000, 0x000001, or 0x000002) or "
"improper prevention byte (0x03). "

Copilot AI Jul 15, 2025

The error message contains technical inaccuracies. The sequences 0x000000, 0x000001, and 0x000002 are not inherently illegal in H.264 streams - 0x000001 is actually the standard start code prefix. The message should focus on the actual failure rather than speculating about specific byte patterns.

Suggested change
- "This NAL unit contains an illegal byte sequence (0x000000, 0x000001, or 0x000002) or "
- "improper prevention byte (0x03). "
+ "This NAL unit contains improper prevention bytes (0x03) or other anomalies. "

Contributor Author

@steel-bucket steel-bucket Jul 15, 2025

I think this suggestion is incorrect. The comment at line 274 states that 0x000000, 0x000001, or 0x000002 shall not occur at any byte-aligned position.
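
To make the rule under discussion concrete, here is a minimal sketch of emulation prevention byte removal that rejects those three sequences. It is not CCExtractor's remove_03emu; strip_emulation_prevention is a hypothetical helper written only for illustration. It assumes the buffer holds the bytes inside a single NAL unit (after the start code), and it follows the H.264 rules that 0x000000, 0x000001 and 0x000002 must not appear there and that 0x000003 may only be followed by a byte in the range 0x00 to 0x03 (or end the unit).

```c
#include <stddef.h>

/* Hypothetical helper, not CCExtractor's remove_03emu.
 * Strips emulation prevention bytes (the 0x03 in 0x000003) in place.
 * Returns the new length, or -1 if the payload contains a byte
 * sequence that is not allowed inside a NAL unit. */
static long strip_emulation_prevention(unsigned char *buf, size_t len)
{
    size_t in = 0, out = 0;
    int zeros = 0; /* consecutive 0x00 bytes immediately before buf[in] */

    while (in < len) {
        unsigned char b = buf[in];

        if (zeros >= 2) {
            if (b <= 0x02)
                return -1; /* 0x000000, 0x000001 or 0x000002 */

            if (b == 0x03) {
                /* 0x000003 must be followed by 0x00..0x03, or end the unit. */
                if (in + 1 < len && buf[in + 1] > 0x03)
                    return -1;
                in++;      /* drop the prevention byte */
                zeros = 0;
                continue;
            }
        }

        buf[out++] = b;
        zeros = (b == 0x00) ? zeros + 1 : 0;
        in++;
    }
    return (long)out;
}
```

With this reading, 0x000001 is a legal start code prefix between NAL units but an error inside one, which is the distinction both comments are circling around.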

Comment on lines 104 to +110
return;
}

dvprint("BEGIN NAL unit type: %d length %d ref_idc: %d - Buffered captions before: %d\n",
nal_unit_type, NAL_stop - NAL_start - 1, dec_ctx->avc_ctx->nal_ref_idc, !dec_ctx->avc_ctx->cc_buffer_saved);


Copilot AI Jul 15, 2025

Moving the dvprint statement after the error check means debug information won't be printed for NAL units that fail emulation prevention byte removal. This could make debugging more difficult. Consider keeping the debug print before the error check or adding a separate debug message for failed NAL units.

Suggested change
- return;
- }
- dvprint("BEGIN NAL unit type: %d length %d ref_idc: %d - Buffered captions before: %d\n",
- nal_unit_type, NAL_stop - NAL_start - 1, dec_ctx->avc_ctx->nal_ref_idc, !dec_ctx->avc_ctx->cc_buffer_saved);
+ dvprint("Failed to process NAL unit type: %d - Emulation prevention byte removal failed.\n",
+ nal_unit_type);
+ return;
+ }

@steel-bucket
Contributor Author

@prateekmedia I have the AVC functions scheduled for the 11th week of my plan. Should we scrap this PR, since the code will be redundant anyway?

@ccextractor-bot
Collaborator

CCExtractor CI platform finished running the test files on linux. Below is a summary of the test results, when compared to test for commit 1c7e2a0...:
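| Report Name | Tests Passed |
| --- | --- |
| Broken | 13/13 |
| CEA-708 | 14/14 |
| DVB | 7/7 |
| DVD | 3/3 |
| DVR-MS | 2/2 |
| General | 27/27 |
| Hardsubx | 1/1 |
| Hauppage | 3/3 |
| MP4 | 3/3 |
| NoCC | 10/10 |
| Options | 86/86 |
| Teletext | 21/21 |
| WTV | 13/13 |
| XDS | 33/34 |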

Your PR breaks these cases:

  • ccextractor --autoprogram --out=ttxt --xds --latin1 --ucla e274a73653...

Congratulations: Merging this PR would fix the following tests:


It seems that not all tests were passed completely. This is an indication that the output of some files is not as expected (but might be according to you).

Check the result page for more info.

@ccextractor-bot
Collaborator

CCExtractor CI platform finished running the test files on windows. Below is a summary of the test results, when compared to test for commit 1c7e2a0...:
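| Report Name | Tests Passed |
| --- | --- |
| Broken | 13/13 |
| CEA-708 | 14/14 |
| DVB | 5/7 |
| DVD | 3/3 |
| DVR-MS | 2/2 |
| General | 27/27 |
| Hardsubx | 1/1 |
| Hauppage | 3/3 |
| MP4 | 3/3 |
| NoCC | 10/10 |
| Options | 86/86 |
| Teletext | 7/21 |
| WTV | 13/13 |
| XDS | 34/34 |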

Your PR breaks these cases:

Congratulations: Merging this PR would fix the following tests:

  • ccextractor --hardsubx 1a0302f7fd..., Last passed: Never
  • ccextractor --autoprogram --out=ttxt --xds --latin1 --ucla e274a73653..., Last passed: Never

It seems that not all tests were passed completely. This is an indication that the output of some files is not as expected (but might be according to you).

Check the result page for more info.
