It appears that several ways of using sarif from the CLI to produce human-readable output share a behaviour that is not ideal. When the input SARIF file contains multiple results, and each result's artifactLocation property has the same uri value, the paths to those results are removed entirely from the output.
So, given a SARIF file like so:
{
"runs": [
{
"originalUriBaseIds": {
"PWD": {
"uri": "file:///home/lens_r/Programming/play/LensorCompilerCollection/"
}
},
"results": [
{ "locations": [ { "physicalLocation": { "artifactLocation": { "uri": "./foo.c", "uriBaseId": "PWD" } } } ] , ...},
{ "locations": [ { "physicalLocation": { "artifactLocation": { "uri": "./foo.c", "uriBaseId": "PWD" } } } ] , ...}
]
}
]
}
We get output from sarif emacs above.sarif like so:
-*- compilation -*-
Sarif Summary: GNU C17
Document generated on: 2025-12-15 09:16:01.681255
Total number of distinct issues of all severities (error, warning, note): 1
Severity : error [1]
:2: error expected ‘;’ before ‘}’ token
:6: error expected ‘;’ before ‘}’ token
Severity : warning [0]
Severity : note [0]
What I'd expect to see is foo.c:2: and foo.c:6:; instead, as you can see, the path is completely blank. Methinks this has something to do with a QoL feature which would automagically remove any shared prefix across results, such that a naive SARIF producer may simply place absolute paths (vs the recommended relative ones) and have the user still be fed human-readable paths, but I don't actually have any evidence for that other than a hunch.
NOTE: To produce a valid SARIF file that exhibits this problem, see the attached file, foo.c.sarif.json, or use GCC to make one for you from the following source.
gcc ./foo.c -fdiagnostics-format=sarif-file
// foo.c
int main(){
return 0
}
int main1(){
return 1
}
EDIT: The problem still occurs even if the artifactLocation uses an index property to refer to an artifact within the artifacts array, not just for strings passed to uri. So, the "resolution" happens before the problem occurs, I'd say.
EDIT: More insight: when the two locations differ in suffix but still share a common prefix, the path is cut in the middle of a path segment. For example, ./foo1.c and ./foo2.c will result in 1.c and 2.c in the output, rather than the expected foo1.c and foo2.c. It seems there is a rather rudimentary transformation applied to the final paths, where any shared prefix is removed character by character, without any care for path segment boundaries. I propose that it should remove only whole shared leading path segments rather than all shared leading characters.
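To illustrate the difference, here is a minimal Python sketch. I have not checked how sarif actually implements this, so both functions are hypothetical: strip_char_prefix is only my guess at the current behaviour (it reproduces the output I'm seeing), and strip_segment_prefix is one possible shape for the proposed fix, assuming forward-slash-separated paths.

```python
import os
import posixpath


def strip_char_prefix(paths):
    """Guess at current behaviour: drop the longest shared leading string."""
    # os.path.commonprefix compares character by character, not by segment.
    prefix = os.path.commonprefix(paths)
    return [p[len(prefix):] for p in paths]


def strip_segment_prefix(paths):
    """Proposed behaviour: drop only whole shared leading directories."""
    split = [posixpath.normpath(p).split("/") for p in paths]
    # Only directory segments are candidates, so the file name always survives.
    dirs = [parts[:-1] for parts in split]
    shared = 0
    for segments in zip(*dirs):
        if len(set(segments)) != 1:
            break
        shared += 1
    return ["/".join(parts[shared:]) for parts in split]


print(strip_char_prefix(["./foo.c", "./foo.c"]))       # ['', '']
print(strip_char_prefix(["./foo1.c", "./foo2.c"]))     # ['1.c', '2.c']
print(strip_segment_prefix(["./foo.c", "./foo.c"]))    # ['foo.c', 'foo.c']
print(strip_segment_prefix(["./foo1.c", "./foo2.c"]))  # ['foo1.c', 'foo2.c']
```

With absolute paths such as /home/u/proj/src/a.c and /home/u/proj/lib/b.c, the segment-wise version yields src/a.c and lib/b.c, so the convenience for producers that emit absolute paths would be kept.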