Skip to content

[codex] Harden PKCS#7 and certificate detection#68

Merged
PrzemyslawKlys merged 5 commits intomasterfrom
codex/fileinspectorx-pkcs7-hardening
Apr 2, 2026
Merged

[codex] Harden PKCS#7 and certificate detection#68
PrzemyslawKlys merged 5 commits intomasterfrom
codex/fileinspectorx-pkcs7-hardening

Conversation

@PrzemyslawKlys
Copy link
Copy Markdown
Member

What changed

  • replace the loose PKCS#7 and PKCS#12 OID substring matches with top-level ASN.1 shape checks
  • preserve declared PKCS#7 subtypes such as .spc and .p7s while keeping bounded certificate parsing on the analysis path
  • tighten DER X.509 detection to validate the certificate envelope structure instead of matching algorithm OIDs anywhere near the front of the blob
  • add regression coverage for valid .p7b/.spc/.p7s and .p12 files, oversized .p7b budget skips, and multiple malformed ASN.1 lookalikes

Why

TierBridge calls FileInspectorX on the service hot path, so the PKCS#7 bundle path needed to stay inside the existing read budget and to classify crypto containers precisely. This PR hardens the detector behavior around PKCS#7, PKCS#12, and DER certificates so we reduce both avoidable memory risk and ASN.1 false positives.

Validation

  • dotnet test FileInspectorX.Tests/FileInspectorX.Tests.csproj --nologo --filter "FullyQualifiedName~CryptoDetectionsTests"
  • dotnet test FileInspectorX.Tests/FileInspectorX.Tests.csproj --nologo

@claude
Copy link
Copy Markdown

claude bot commented Apr 1, 2026

PR Review: Harden PKCS#7 and certificate detection

Overall this is a solid improvement — replacing OID substring scanning with proper top-level ASN.1 structural validation eliminates a real class of false positives, and the bounded-read protection is the right call for a hot-path library. A few things worth addressing:


Potential Issues

1. src.ToArray() allocation in detection hot path (Signatures.Crypto.cs)

All three TryMatch* methods call src.ToArray() to construct an AsnReader, allocating a new heap array on every invocation even when detection fails quickly. Since these run inside the per-file detection pipeline, this adds GC pressure for bulk workloads.

On .NET 6+, AsnReader accepts ReadOnlyMemory<byte>, which avoids the copy when the caller already holds a Memory<byte> slice. If the src span originates from a byte[] (which it appears to), you can use MemoryMarshal.CreateFromPinnedArray or thread Memory<byte> through the detection chain instead of ReadOnlySpan<byte>. At minimum, add a src.Length < N short-circuit before the ToArray() call to avoid allocating on obviously-too-small inputs (the < 16 guard helps, but the allocation still happens before AsnReader can reject further).

2. Duplicate spc registration in Maps.cs

spc is added to both MimeMaps.Default (line ~762) and ExtraMime.Crypto. Since TryGetByExtension checks Default first and returns early, the ExtraMime.Crypto entry for spc is dead code. Remove one of them to avoid confusion about which is authoritative.

3. Redundant preprocessor condition

#if NET5_0_OR_GREATER || NET8_0_OR_GREATER

NET8_0_OR_GREATER implies NET5_0_OR_GREATER, so the || NET8_0_OR_GREATER term is never reached. This appears in both the production code (TryLoadCertificateFromFile) and the test file. Just #if NET5_0_OR_GREATER is sufficient.

4. Test isolation: shared mutable Settings.DetectionReadBudgetBytes

Several tests (e.g. Analyze_P7bCertificateBundle_ParsesWithinBudget, Analyze_PemCertificate_UsesBoundedHeadRead) mutate FileInspectorX.Settings.DetectionReadBudgetBytes and restore it in a finally block. xUnit runs test classes in parallel by default, so concurrent tests in the same class can observe each other's temporary budget values, leading to flaky failures under parallel runs.

Options: disable parallelism on this class with [Collection("sequential")], wrap mutations in a lock, or — cleaner — accept the budget as a parameter in the method under test so tests can pass it directly without touching global state.


Minor Observations

5. Real RSA 2048-bit key generation in unit tests

Tests like Detect_Pkcs12_By_TopLevel_Pfx_Shape and Detect_Der_X509_By_CertificateShape generate real 2048-bit RSA key pairs. This is 10–100× slower than crafting a minimal valid byte fixture. Consider using pre-generated small test vectors for the detector unit tests, reserving live crypto generation for integration-level tests that actually need a valid certificate chain.

6. TryLoadCertificateFromFile fall-through for .crt/.cer

When TryReadPemCertificateBlock fails for a crt/cer file, the code silently falls through to the TryReadFileBytesWithinBudget DER path. This is the right behavior for DER-encoded .cer files, but the control flow is non-obvious — the empty if (pem == false && ext != "pem") { /* fall through */ } branch has no comment explaining it intentionally drops into DER parsing. A short comment here would help maintainers.


What's Done Well

  • The PKCS#12 version check (version < 3) and ContentInfo OID exact-match (1.2.840.113549.1.7.2) before checking for the explicit [0] wrapper is the correct structural test per RFC 7292.
  • TryMatchDerCertificate cross-checking tbsSignatureOid == certificateSignatureOid follows RFC 5280 §4.1.1.2 and meaningfully filters lookalike structures.
  • The NormalizeMime fix to use MimeMaps.TryGetByExtension instead of MimeMaps.Default directly ensures crypto extensions like p7s/spc get the right MIME type when normalizing from application/octet-stream.
  • Negative test cases for each detector (OID not at top level, missing wrapper, incomplete envelope) are exactly the right regression coverage.
  • Budget-bounded reading in TryParseP7b is the correct fix for the memory concern on the hot path.

🤖 Generated with Claude Code

@claude
Copy link
Copy Markdown

claude bot commented Apr 1, 2026

PR Review: Harden PKCS#7 and Certificate Detection

Overall this is a solid, well-motivated change. Replacing raw OID-byte window scanning with proper top-level ASN.1 structure validation is the right approach for a security-adjacent library, and the test overhaul is a clear improvement. The following are observations across correctness, code quality, and tests.


Correctness Issues

PKCS#12 version check allows non-standard versions

Signatures.Crypto.csTryMatchPkcs12:

if (version < 3)
{
    return false;
}

RFC 7292 §4 states the PFX version field MUST be v3 (integer value 3). The current guard accepts any version ≥ 3 (e.g., crafted input with version = 100 would pass). Should be:

if (version != 3)
{
    return false;
}

WriteSyntheticOleDirectoryFile has inconsistent sector addresses

DetectorTests.cs — new helper method:

WriteLe32(header, 0x2C, 1);   // total FAT sectors = 1
WriteLe32(header, 0x30, 2);   // first directory sector = sector 2 → offset 1536
WriteLe32(header, 0x4C, 0);   // DIFAT[0] (first FAT sector) = sector 0 → offset 512
...
fs.Position = 1024;            // FAT written here → sector 1
fs.Write(fat, 0, fat.Length);
fs.Position = 2048;            // directory written here → sector 3
fs.Write(dir, 0, dir.Length);

With a 512-byte sector size, sector N begins at offset 512 + N × 512:

  • The header says FAT is at sector 0 (offset 512), but the data is at offset 1024 (sector 1).
  • The header says the directory is at sector 2 (offset 1536), but the data is at offset 2048 (sector 3).

If TryGetOleDirectoryNames follows the sector chain from the header, this file is structurally broken and the test could be a false green. Worth verifying the function actually navigates via sector addresses rather than scanning the raw stream, and adjusting the offsets to be consistent:

WriteLe32(header, 0x30, 3);   // directory at sector 3 → offset 2048
WriteLe32(header, 0x4C, 1);   // FAT at sector 1 → offset 1024

Code Quality / Design Notes

Unnecessary array copies in span overloads

internal static bool TryMatchPkcs12(ReadOnlySpan<byte> src, out ContentTypeDetectionResult? result)
    => TryMatchPkcs12(new ReadOnlyMemory<byte>(src.ToArray()), out result);

src.ToArray() allocates a full copy on every call through the span path. This happens in the Detect(byte[]) overload (the data-only path). Since this is inside the detection hot path in TierBridge's service loop, it's worth noting. The allocation is bounded by Settings.HeaderReadBytes (default 4 KB), so it's not a memory safety concern — just a minor unnecessary allocation. If AsnReader accepted ReadOnlySpan<byte> this could be avoided, but it requires ReadOnlyMemory<byte>, so the copy is currently unavoidable. A comment noting this trade-off would help future readers.


TryReadPfxAuthSafe consumes MacData optimistically

if (pfx.HasData)
{
    pfx.ReadEncodedValue();
}
return !pfx.HasData;

This consumes at most one optional element after AuthSafe. PKCS#12 PFX is SEQUENCE { version INTEGER, authSafe ContentInfo, macData MacData OPTIONAL }, so at most one trailing element is expected. The implementation is correct but the comment at the return site return !pfx.HasData could clarify why a second trailing value would cause a reject (indicating a non-conformant structure).


DetectionSettingsCollection disables parallelism only for CryptoDetectionsTests

If any other test class in the suite ever modifies Settings.DetectionReadBudgetBytes, it would race with these tests. The safest approach is to reset the budget inside a fixture's Dispose or to scope the settings mutation to a per-test value (e.g., via a thread-local or scoped settings object). Not an immediate problem, but worth tracking as the suite grows.


Tests: What's Good

  • Replacing synthetic byte blobs with real .NET cryptographic objects (RSA + CertificateRequest) for positive tests is a significant reliability improvement.
  • The negative fixture files (pkcs7-signeddata-oid-lookalike.p7b, der-cert-rsa-oid-lookalike.cer, etc.) directly exercise the false-positive scenarios described in the PR description.
  • Budget enforcement tests (SkipsParsingWhenFileExceedsBudget) are exactly right for a hot-path service.
  • Using [Collection(nameof(DetectionSettingsCollection), DisableParallelization = true)] to serialize global-state-mutating tests is the correct pattern.

Summary

Severity Location Issue
Bug Signatures.Crypto.cs TryMatchPkcs12 version < 3 should be version != 3 per RFC 7292
Bug (needs verification) DetectorTests.cs WriteSyntheticOleDirectoryFile Sector address values don't match the write offsets
Minor Span overloads in Asn1DetectionHelpers / Signatures.Crypto.cs Unnecessary ToArray() allocations; acceptable but worth a comment
Design note DetectionSettingsCollection Scope is narrow; other settings-mutating tests could race in the future

The core detection logic change is well-reasoned and the test coverage is substantially better than what it replaces. Fixing the version check and verifying the OLE sector helper are the two items I'd want resolved before merging.

@claude
Copy link
Copy Markdown

claude bot commented Apr 1, 2026

PR Review: Harden PKCS#7 and certificate detection

Overall this is a solid, well-motivated PR. The move from OID substring scanning to proper top-level ASN.1 shape verification is the right architectural direction, and the regression coverage is comprehensive. A few items worth addressing before merge:


Issues

1. Allocation in Span<byte> overloads — hidden cost on the hot path

In Asn1DetectionHelpers.cs, the ReadOnlySpan<byte> overloads of TryMatchTopLevelContentInfo, TryReadPfxAuthSafe, and TryMatchDerCertificateEnvelope all call src.ToArray() to create a ReadOnlyMemory<byte>:

internal static bool TryMatchTopLevelContentInfo(ReadOnlySpan<byte> src, string expectedContentTypeOid)
    => TryMatchTopLevelContentInfo(new ReadOnlyMemory<byte>(src.ToArray()), expectedContentTypeOid);

src at this call site is already a slice of header (a heap-allocated byte[]), so you're copying up to 1 MB on every detection call even when the file isn't a crypto container. Since ReadOnlyMemory<byte> can be constructed from an array segment without copying, this could instead pass the backing array and offset directly, or the detection pipeline could be refactored to pass ReadOnlyMemory<byte> all the way down (as already done in FileInspector.cs via srcMemory).

Similarly in Signatures.Crypto.cs:

internal static bool TryMatchPkcs12(ReadOnlySpan<byte> src, out ContentTypeDetectionResult? result)
    => TryMatchPkcs12(new ReadOnlyMemory<byte>(src.ToArray()), out result);

The ReadOnlySpan<byte> overloads appear to exist to keep back-compat with the byte-array Detect(byte[]) overload, but since that path already has a ReadOnlySpan<byte> called data, a cleaner fix is to just replace data there with a ReadOnlyMemory<byte> too (as was done in the stream path).

2. TryMatchDerCertificateEnvelope reads optional tbsCertificate fields without consuming them

In the TryMatchDerCertificateEnvelope check, after the mandatory Subject/Validity/SubjectPublicKeyInfo sequences are read, the loop that processes optional ContextSpecific tags [1]–[3] calls tbsCertificate.ReadEncodedValue(), which discards data. This is correct for tbsCertificate, but the check return !certificate.HasData at the end only checks the outer Certificate sequence. If a certificate has trailing garbage after the signatureValue bit string, it would still match. This is probably acceptable for a confidence:Low detector, but worth a comment.

3. TryReadPfxAuthSafe — trailing data consumed silently

if (pfx.HasData)
{
    pfx.ReadEncodedValue();
}
return !pfx.HasData;

This reads at most one extra TLV (the optional macData field in a PFX) but would return false if there are two or more trailing fields. Real PFX files with a MAC will have exactly one, so this should be fine in practice, but it's fragile — a loop or a comment explaining the RFC 7292 PFX structure (version / authSafe / macData?) would help reviewers.

4. Test: DetectionSettingsFixture is empty

DetectionSettingsFixture implements ICollectionFixture<DetectionSettingsFixture> but contains no setup/teardown. The only purpose of the collection is to serialize tests that mutate Settings.DetectionReadBudgetBytes. The fixture's finally blocks already restore the original value, so the collection serialization is correct — but the empty fixture class is a bit confusing. A brief comment explaining why serialization is needed would improve readability.

5. Analyze_PemCertificate_UsesBoundedHeadRead is #if NET8_0_OR_GREATER — but the budget path is also exercised on NET5+

GetCertificateParseReadBudgetBytes() and TryReadFileBytesWithinBudget() are not NET8-gated, and TryReadPemCertificateBlock calls ReadHeadText which is also not NET8-specific. The #if seems to be there only because ExportCertificatePem() is NET5+. Guarding the whole test at NET8_0_OR_GREATER means this coverage is not exercised when building for NET5 or NET6 targets. Consider changing the guard to NET5_0_OR_GREATER (where ExportCertificatePem is available).


Minor / Non-blocking

  • cert = new X509Certificate2(path) in TryLoadCertificateFromFile: The fallback branch (for .crt/.cer DER files) now calls TryReadFileBytesWithinBudget and passes the bytes, which is good. The one slightly odd thing is that cert != null is always true after new X509Certificate2(...) succeeds; the return cert != null expressions are harmless but misleading — they'll never be false unless the constructor throws (which is caught). No change needed, just worth noting.

  • WriteSyntheticOleDirectoryFile internal helper: Good addition. The comment // Our mini CFBF reader maps sector N to offset 512 + ((N + 1) * 512) is useful — just make sure it stays in sync with the actual TryGetOleDirectoryNames implementation if that logic ever changes.

  • Whitespace-only diffs in Signatures.TextAndMarkup.cs and Maps.cs: These should be squashed out before merge to keep the diff clean and avoid noise in git blame.

  • NormalizeMime now calls MimeMaps.TryGetByExtension instead of MimeMaps.Default directly: Correct, since .spc is now only in ExtraMime.Crypto, not in Default. Good catch.


Summary

The core detection logic changes are sound and well-tested. The main ask before merge is addressing the unnecessary ToArray() copies in the Span<byte> bridge overloads, since those run on every file detection call (not just crypto containers). Everything else is low priority. Nice work on the regression fixtures — the "lookalike" negative-test corpus is exactly the right way to guard these detectors.

@claude
Copy link
Copy Markdown

claude bot commented Apr 1, 2026

PR Review: Harden PKCS#7 and certificate detection

Overall this is a solid improvement. Replacing loose OID substring scans with proper System.Formats.Asn1.AsnReader-based structural validation is the right approach, and the test coverage — especially the negative lookalike cases and budget-overflow paths — is exemplary.

A few issues worth discussing before merging:


Bug: Encrypted PFX/P12 files will not be detected

TryMatchPkcs12 validates that the authSafe ContentInfo contentType OID is 1.2.840.113549.1.7.1 (pkcs7-data):

if (!string.Equals(contentTypeOid, "1.2.840.113549.1.7.1", StringComparison.Ordinal))
    return false;

Password-protected (encrypted) .p12/.pfx files use 1.2.840.113549.1.7.6 (pkcs7-encryptedData) as the authSafe contentType. These are common in practice — any PFX exported with a password will use this OID. The new detector will silently miss them, whereas the old heuristic would have caught them because it scanned for BagType OIDs deep inside the structure.

The fix is to also accept 1.2.840.113549.1.7.6:

if (!string.Equals(contentTypeOid, "1.2.840.113549.1.7.1", StringComparison.Ordinal) &&
    !string.Equals(contentTypeOid, "1.2.840.113549.1.7.6", StringComparison.Ordinal))
{
    return false;
}

The test Detect_Pkcs12_By_TopLevel_Pfx_Shape only exercises the unencrypted case (certificate.Export(X509ContentType.Pkcs12) with no password), so this regression is not currently caught.


Performance: unnecessary heap copies in Asn1DetectionHelpers.cs

The ReadOnlySpan<byte> shims all call .ToArray():

internal static bool TryMatchTopLevelContentInfo(ReadOnlySpan<byte> src, string expectedContentTypeOid)
    => TryMatchTopLevelContentInfo(new ReadOnlyMemory<byte>(src.ToArray()), expectedContentTypeOid);

When the stream-based Detect() path is used, the ReadOnlyMemory<byte> overloads are called directly — no allocation. But from Detect(ReadOnlySpan<byte>), DetectCore takes the else branch and calls the span overloads, which each allocate a full copy of the header buffer. Since three crypto matchers run in sequence (PKCS#12, PKCS#7, DER cert), you get up to three independent copies of the same header data when all three miss.

A simpler approach for the span path would be to pin a single MemoryHandle once or just accept this trade-off for the ReadOnlySpan<byte> API surface (which is less commonly used than the stream or byte-array paths). At minimum, the comment near these shims should note the allocation.


Test isolation: cross-collection parallelism risk

DetectionSettingsCollection disables parallelization within the collection, which is the right call for tests that mutate Settings.DetectionReadBudgetBytes. However, xUnit runs different collections in parallel by default. Tests in other collections that read DetectionReadBudgetBytes during a window where the crypto tests have lowered the budget could see incorrect behavior.

Consider whether DetectorTests (or other classes) should join the same collection, or whether Settings mutations should be wrapped in a lock/restore pattern that's also async-safe.


#if NET8_0_OR_GREATER gates the most important new tests

CreateSignedCmsBundle and the tests that rely on it — including the budget-skip and extension-preservation tests (Analyze_P7bCertificateBundle_*, Analyze_SpcCertificateBundle_*, Analyze_P7sSignedData_*) — are wrapped in #if NET8_0_OR_GREATER. If CI targets an older TFM for any build variant, these regression tests simply won't compile or run.

Is the SignedCms API only available on .NET 8+, or was the guard added conservatively? System.Security.Cryptography.Pkcs.SignedCms has been available since .NET Core 2.1 via the System.Security.Cryptography.Pkcs NuGet package. If the package is already referenced, the guard can be relaxed to #if NETCOREAPP2_1_OR_GREATER || NET46_OR_GREATER.


Minor: TryReadTopLevelContentInfo — strict trailing-data check

return !contentInfo.HasData;

This rejects a ContentInfo where the outer SEQUENCE has any bytes beyond the [0] EXPLICIT wrapper. While correct per the spec (a ContentInfo has exactly two fields), some older CMS tooling adds trailing zeros or extended attributes. If TierBridge processes arbitrary third-party-signed payloads, this could be a source of false negatives. Worth documenting the intent even if the strict check is intentional.


Nit: WriteSyntheticOleDirectoryFile — comment inconsistency

The comment says "FAT sector SID 0 lives at 1024", but WriteLe32(header, 0x2C, 1) writes value 1 to DIFAT[0], implying the FAT sector is SID 1 (offset 1536), while the actual fs.Write(fat, ...) is at fs.Position = 1024 (SID 0). The helper works in practice (tests pass), but the comment and the DIFAT entry don't agree. Low priority since this is test-only code.


Summary

Severity Issue
Bug Encrypted PFX/P12 not detected (authSafe OID 1.2.840.113549.1.7.6 not accepted)
Medium Span-to-memory shims allocate unnecessarily; worth documenting or fixing
Medium #if NET8_0_OR_GREATER may silently skip critical new tests on older CI TFMs
Low Cross-collection parallelism could cause flaky budget-sensitive tests
Nit Strict trailing-data check in TryReadTopLevelContentInfo — add a comment
Nit OLE helper comment/DIFAT value mismatch

The structural hardening approach is correct and the budget-bounding logic is well-designed. The encrypted PFX case is the only change I'd flag as a blocker before merging.

@claude
Copy link
Copy Markdown

claude bot commented Apr 2, 2026

PR Review: Harden PKCS#7 and Certificate Detection

Overall this is a solid, well-motivated change. Replacing brittle OID substring scanning with proper ASN.1 structural validation is the right approach for a hot-path detector, and the test coverage is comprehensive. A few points worth discussing:


Design / Architecture

DetectCore dual-branch duplication — The dataMemory.HasValue conditional block in DetectCore creates a parallel code path for three detectors:

if (dataMemory.HasValue)
{
    if (Signatures.TryMatchPkcs12(dataMemory.Value, ...)) ...
    if (Signatures.TryMatchPkcs7SignedData(dataMemory.Value, ...)) ...
    if (Signatures.TryMatchDerCertificate(dataMemory.Value, ...)) ...
}
else
{
    if (Signatures.TryMatchPkcs12(data, ...)) ...      // ToArray() inside
    if (Signatures.TryMatchPkcs7SignedData(data, ...)) ...
    if (Signatures.TryMatchDerCertificate(data, ...)) ...
}

If a 4th crypto detector is added later, both arms need updating. A small helper that produces a ReadOnlyMemory<byte> from either source — or always materialising a memory slice when one isn't provided — would collapse this to a single set of calls.

Span-to-Memory bridge allocations — The span overloads in Asn1DetectionHelpers call src.ToArray() internally. The inline comments warn callers, but the warnings live in Signatures.Crypto.cs, not on the Asn1DetectionHelpers public surface where they would be most discoverable. Consider repeating the note as an XML <remarks> on the span overloads themselves.


Correctness

TryMatchDerCertificateEnvelope — subjectPublicKeyInfo ignored — After reading serial number, signature algorithm, issuer, validity, subject, and one more ReadSequence() (subjectPublicKeyInfo), the optional-extensions loop validates context-specific tags [1]–[3]. That loop is correct per RFC 5280. However, the four bare tbsCertificate.ReadSequence() calls are not labelled; a comment like // issuer, validity, subject, subjectPublicKeyInfo would help reviewers verify the count.

TryReadPfxAuthSafe — macData handling is lenient — The comment says "a second extra TLV is non-conformant", but the current code reads one extra encoded value unconditionally when pfx.HasData, then asserts !pfx.HasData. This correctly allows exactly one macData field; worth a short inline comment explaining why one optional field is allowed (PKCS#12 §4 macData).

TryLoadCertificateFromFile — fall-through for .crt/.cer — When TryReadPemCertificateBlock returns false for a .crt or .cer file, the code intentionally falls through to raw-byte import. The comment says "may still be DER-encoded, so fall through" but the condition controlling the fall-through is a negated if (!TryReadPemCertificateBlock(...)) with a nested early-return for .pem. The logic is correct but the triple-level nesting makes it easy to misread. Extracting the .pem-only early-exit as a guard clause before the PEM-parse block would clarify intent.


Performance

ReadHeadText budget for PEM parseTryReadPemCertificateBlock reads up to GetCertificateParseReadBudgetBytes() characters of text to find the PEM block. If DetectionReadBudgetBytes is set very large (the cap is 8 MB), this allocates a large string on every .pem/.crt/.cer analysis. For most certificate files this is fine, but it may be worth noting that the effective cap for string allocation is 8 MB even if the budget is smaller than the file.

ToArray() on every Detect(ReadOnlySpan<byte>) call — The span path in DetectCore calls TryMatchPkcs12(data, ...)new ReadOnlyMemory<byte>(src.ToArray()). For callers who only have a span (common in stream-based detection), this materialises a fresh heap copy on every detection attempt, even for files that will be rejected in the first few bytes of ASN.1 validation. A quick data[0] != 0x30 guard before the ToArray() copy (mirroring the old code's first check) would avoid the allocation for non-ASN.1 inputs on the span path.


Tests

Detect_Pkcs12_DoesNotMatch_When_Pfx_Version_Is_Not_V3 — The inline byte array encodes version = 100 (0x64) which is a good edge case. However, the authSafe SEQUENCE only contains the OID for PKCS#7-Data (1.2.840.113549.1.7.1) with no actual content wrapper — A0 00 (empty explicit [0]). TryReadPfxAuthSafe should reject this because TryReadExplicitContextSpecificZero would succeed but authSafe.HasData would be false before reaching the explicit wrapper check… actually A0 00 is an empty explicit [0] which passes. The test passes because version ≠ 3. Worth a comment confirming that version is the intended rejection criterion for this fixture (not the missing authSafe content).

#if NET5_0_OR_GREATER guard on budget/bundle tests — The most interesting new tests (Analyze_P7bCertificateBundle_ParsesWithinBudget, _SkipsParsingWhenFileExceedsBudget, SPC/P7S variants) are gated on NET5_0_OR_GREATER. This is necessary for SignedCms.ComputeSignature, but it means those tests don't run on older TFMs. If the library targets netstandard2.0 / .NET Framework, the budget-guard path goes untested on those targets. A tracking note or separate CI job might be worthwhile.

Global mutable stateDetectionSettingsCollection with DisableParallelization = true is the right fix for tests that mutate Settings.DetectionReadBudgetBytes. The try/finally restore pattern is consistently applied. One nit: the tests that do not modify the budget (e.g. the negative detection tests) are still serialised because they share the same [Collection]. Consider moving the purely non-mutating tests out of the collection so they can still run in parallel.


Minor / Cosmetic

  • Several files in the diff show only whitespace / indentation normalisation (e.g. Maps.cs, Signatures.TextAndMarkup.cs). These are harmless but inflate the diff; squashing them into a dedicated formatting commit would make the meaningful changes easier to review in future.
  • The #if NET5_0_OR_GREATER || NET8_0_OR_GREATER guard removed from Latin1String (now just NET5_0_OR_GREATER) is a silent correctness fix worth calling out in the commit message — NET 8 is a superset of NET 5 so the old guard was redundant.

Summary

The core approach — replacing OID-scan heuristics with strict top-level ASN.1 envelope validation using AsnReader — is correct, safe, and well-tested. The security improvement is meaningful: the old OID-prefix search could be fooled by any binary blob that happened to contain those bytes at an arbitrary offset. The main items worth addressing before merge are the DetectCore dual-branch maintainability risk and the lack of a guard to avoid ToArray() allocations for non-ASN.1 inputs on the span path.

@PrzemyslawKlys PrzemyslawKlys marked this pull request as ready for review April 2, 2026 08:32
@PrzemyslawKlys PrzemyslawKlys merged commit c039546 into master Apr 2, 2026
7 checks passed
@PrzemyslawKlys PrzemyslawKlys deleted the codex/fileinspectorx-pkcs7-hardening branch April 2, 2026 08:33
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant