feat(evals): add Scala retrieval eval and fix dropped infix/field-expression Scala calls #567
…ix and field-expression calls
…ll target extractor
@greptile review
Code Review
This pull request introduces multi-language retrieval evaluation for Scala, comparing cgr's Scala call graph against an independent oracle built using scalameta (run via scala-cli). It also fixes a bug in call_processor.py where Scala infix operator calls and nullary method calls were dropped, and adds corresponding tests, constants, logging, and documentation. The review feedback suggests improving error handling in evals/oracles/scala_oracle.py when parsing the JSON output of the subprocess, specifically by catching JSONDecodeError and including stderr and stdout in the error message to facilitate debugging.
proc = subprocess.run(
    [scala_cli, ec.SCALA_CLI_RUN, str(_SOURCE), ec.SCALA_CLI_ARG_SEP, str(target)],
    cwd=str(_ORACLE_DIR),
    capture_output=True,
    text=True,
    check=True,
)
payload: OraclePayload = json.loads(proc.stdout or "{}")
Since capture_output=True is used, any warnings or errors printed to stderr (or unexpected output on stdout) will be captured and not visible to the user if json.loads fails with a JSONDecodeError. To make debugging easier, consider catching json.JSONDecodeError and logging or raising an exception that includes proc.stderr and proc.stdout.
Fixed in cd32275: _run_scala_oracle_payload now catches json.JSONDecodeError and raises a RuntimeError that includes proc.stderr and proc.stdout (via SCALA_ORACLE_DECODE_FAILED), so a compile error or changed launcher banner is surfaced instead of a context-free decode failure.
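A minimal sketch of the pattern this thread describes, not the actual body of _run_scala_oracle_payload: the helper name run_json_command and the message wording are hypothetical, and the real code formats its message through the SCALA_ORACLE_DECODE_FAILED constant, whose text is not shown here.

```python
import json
import subprocess
import sys


def run_json_command(cmd: list[str]) -> dict:
    """Run cmd, parse its stdout as JSON, and surface both streams on a decode failure.

    Hypothetical helper for illustration; it only mirrors the shape of the fix.
    """
    proc = subprocess.run(cmd, capture_output=True, text=True, check=True)
    try:
        # Empty stdout is treated as an empty payload, as in the reviewed snippet.
        return json.loads(proc.stdout or "{}")
    except json.JSONDecodeError as exc:
        # capture_output=True hides both streams from the user, so attach them here.
        raise RuntimeError(
            f"could not decode oracle output: {exc}\n"
            f"stderr:\n{proc.stderr}\n"
            f"stdout:\n{proc.stdout}"
        ) from exc
```

Chaining with `from exc` keeps the original decode position in the traceback, while the message carries whatever the launcher printed, such as a compile error on stderr or a banner on stdout.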
Greptile Summary
This PR adds a Scala retrieval eval and updates Scala call extraction. The main changes are:
Confidence Score: 4/5
The Scala retrieval evaluation and call extraction changes are narrowly scoped, with attention needed only for comment-policy compliance in the new oracle file. The functional changes are covered by the added Scala retrieval fixture tests and described broader test runs; the remaining concern is a project-style hook violation in explanatory inline comments.
evals/oracles/scala_oracle/Oracle.scala
Greptile Summary
This PR adds Scala retrieval evaluation and updates Scala call extraction. The main changes are:
Confidence Score: 4/5
The Scala eval additions are mostly contained, but the call extraction change can create incorrect graph edges for ordinary field reads. The review is focused on the changed parser and eval paths, and the reported behavior was confirmed with a targeted Scala indexing case.
codebase_rag/parsers/call_processor.py
… not ambiguous bare selects
@greptile review
On the remaining 'bare field_expression calls still dropped' point: this is intentional, and it directly implements the earlier P1 review feedback. Scala's uniform access makes a nullary method call ( |
What
Adds a Scala CALLS-vs-scalameta retrieval eval (the last cgr-supported language without one) and fixes a real cgr bug it surfaced.

The eval
evals/scala_retrieval.py grades cgr's Scala CALLS edges, reduced to (caller_file, callee_simple_name), against call sites extracted by scalameta over the same first-party name universe (every declared def). The oracle (evals/oracles/scala_oracle/Oracle.scala, run via scala-cli) is independent of cgr's tree-sitter frontend. It grades only files it parses cleanly (the covered set; the cgr side is held to the same files), so a Scala 3 file that the 2.13 parser rejects penalizes neither side. Requires scala-cli on PATH; the eval exits cleanly if it is absent. Pinned by codebase_rag/tests/test_scala_retrieval_eval.py.

The bug
cgr's Scala spec lists infix_expression and field_expression as call node types, so both are collected, but call_processor._get_call_target_name had no case for either:
- infix_expression (a ~> b, xs map f) exposes its callee through an operator field;
- a nullary method call (obj.done) is a bare field_expression whose callee is its field.
Neither has the function field the extractor keyed on, so every such call returned no name and never attached. The fix adds two Scala-gated cases (the field_expression node type collides with C++, so the guard keeps other languages inert).

Result
On scala-parser-combinators (a combinator DSL, so infix-operator heavy): recall 0.40 → 0.73, F1 0.57 → 0.84, precision held at 1.0 (zero false positives). scopt shows the same shape (precision 1.0, recall 0.71). The residual is the diffuse receiver-type-inference tail every language eval carries (implicit conversions such as "" ~>, deeply generic receivers), not a systematic gap.

Testing
RED→GREEN in history: the eval + fixture test commit is red for the infix/field-expression assertions; the fix commit turns it green. Full non-integration suite (4121 passed) and integration suite (142 passed, serial) green; call/language-focused subset (1541 passed) confirms no cross-language regression.
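
The grading described under "The eval" can be sketched as a set comparison over (caller_file, callee_simple_name) pairs restricted to the covered files. This is an illustrative sketch under stated assumptions, not the code in evals/scala_retrieval.py: the function name grade, the Pair alias, and the sample file names are all hypothetical.

```python
Pair = tuple[str, str]  # (caller_file, callee_simple_name)


def grade(predicted: set[Pair], oracle: set[Pair], covered: set[str]) -> dict[str, float]:
    """Score predicted call pairs against oracle pairs on the covered files only.

    Hypothetical helper: both sides are filtered to files the oracle parsed
    cleanly, so an unparsed file penalizes neither side.
    """
    pred = {p for p in predicted if p[0] in covered}
    gold = {p for p in oracle if p[0] in covered}
    true_pos = len(pred & gold)
    precision = true_pos / len(pred) if pred else 0.0
    recall = true_pos / len(gold) if gold else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


# Made-up data: the oracle sees five calls in A.scala; the graph finds two of them,
# plus one edge in a file outside the covered set, which is ignored.
oracle_pairs = {("A.scala", n) for n in ("map", "~>", "done", "parse", "run")}
graph_pairs = {("A.scala", "parse"), ("A.scala", "run"), ("Scala3Only.scala", "foo")}
scores = grade(graph_pairs, oracle_pairs, covered={"A.scala"})
```

With precision at 1.0, F1 reduces to 2r / (1 + r) for recall r, so r = 0.40 gives about 0.57 and r = 0.73 gives about 0.84, which is consistent with the figures reported under Result.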