feat(evals): add Scala retrieval eval and fix dropped infix/field-expression Scala calls by vitali87 · Pull Request #567 · vitali87/code-graph-rag

vitali87 · 2026-07-01T22:02:28Z

What

Adds a Scala CALLS-vs-scalameta retrieval eval (the last cgr-supported language without one) and fixes a real cgr bug it surfaced.

The eval

evals/scala_retrieval.py grades cgr's Scala CALLS edges, reduced to (caller_file, callee_simple_name), against call sites extracted by scalameta over the same first-party name universe (every declared def). The oracle (evals/oracles/scala_oracle/Oracle.scala, run via scala-cli) is independent of cgr's tree-sitter frontend. It grades only files it parses cleanly (covered set; the cgr side is held to the same files), so a Scala 3 file the 2.13 parser rejects penalizes neither side. Requires scala-cli on PATH; the eval exits cleanly if absent. Pinned by codebase_rag/tests/test_scala_retrieval_eval.py.
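
The grading scheme described above can be sketched roughly as follows. This is a minimal illustration, not the code in evals/scala_retrieval.py: the function name `score_edges`, the empty-set conventions, and the sample data are all hypothetical, and it assumes both sides have already been reduced to (caller_file, callee_simple_name) pairs.

```python
from typing import NamedTuple

Edge = tuple[str, str]  # (caller_file, callee_simple_name)


class Score(NamedTuple):
    precision: float
    recall: float


def score_edges(cgr: set[Edge], oracle: set[Edge],
                covered: set[str], universe: set[str]) -> Score:
    """Grade cgr edges against oracle edges on covered files and first-party names only."""

    def keep(edges: set[Edge]) -> set[Edge]:
        # Both sides are held to the same files and the same name universe.
        return {(f, n) for f, n in edges if f in covered and n in universe}

    got, want = keep(cgr), keep(oracle)
    hits = got & want
    # Assumption: an empty side scores 1.0 rather than dividing by zero.
    precision = len(hits) / len(got) if got else 1.0
    recall = len(hits) / len(want) if want else 1.0
    return Score(precision, recall)


# Hypothetical data: B.scala is outside the covered set, so neither side is graded on it.
cgr_edges = {("A.scala", "parse"), ("B.scala", "run")}
oracle_edges = {("A.scala", "parse"), ("A.scala", "~>"), ("B.scala", "stop")}
result = score_edges(cgr_edges, oracle_edges,
                     covered={"A.scala"},
                     universe={"parse", "~>", "run", "stop"})
print(result)
```

Restricting both edge sets before comparing them is what keeps a file the oracle cannot parse from counting against either side.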

The bug

cgr's Scala spec lists infix_expression and field_expression as call node types, so both are collected, but call_processor._get_call_target_name had no case for either:

- an infix_expression (a ~> b, xs map f) exposes its callee through an operator field;
- a paren-less nullary call (obj.done) is a bare field_expression whose callee is its field.

Neither has the function field the extractor keyed on, so every such call returned no name and never attached. The fix adds two Scala-gated cases (the field_expression node type collides with C++, so the guard keeps other languages inert).
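
A rough sketch of the two Scala-gated cases as described above. It uses a tiny stand-in `Node` class rather than real tree-sitter nodes, and `call_target_name` is a hypothetical simplification, not the body of call_processor._get_call_target_name. Only the node types (infix_expression, field_expression) and the field names (function, operator, field) come from the description; the child node types in the sample data are placeholders.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Node:
    """Stand-in for a parse-tree node: a type, its text, and named child fields."""
    type: str
    text: str = ""
    fields: dict[str, "Node"] = field(default_factory=dict)

    def child_by_field_name(self, name: str) -> Optional["Node"]:
        return self.fields.get(name)


def call_target_name(node: Node, language: str) -> Optional[str]:
    """Return the callee's simple name, or None when the node exposes none."""
    func = node.child_by_field_name("function")
    if func is not None:
        return func.text
    if language == "scala":  # guard: field_expression is also a C++ node type
        if node.type == "infix_expression":
            op = node.child_by_field_name("operator")
            return op.text if op else None
        if node.type == "field_expression":
            fld = node.child_by_field_name("field")
            return fld.text if fld else None
    return None


# Placeholder child node types; only the parent types and field names matter here.
infix = Node("infix_expression", "a ~> b", {"operator": Node("identifier", "~>")})
select = Node("field_expression", "obj.done", {"field": Node("identifier", "done")})

scala_infix = call_target_name(infix, "scala")
scala_select = call_target_name(select, "scala")
cpp_select = call_target_name(select, "cpp")
print(scala_infix, scala_select, cpp_select)
```

The field_expression branch reflects the description at this point in the thread; the later refactor commit (5320217) scopes naming to application and infix sites only.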

Result

On scala-parser-combinators (a combinator DSL, so infix-operator heavy): recall 0.40 → 0.73, F1 0.57 → 0.84, precision held at 1.0 (zero false positives). scopt shows the same shape (precision 1.0, recall 0.71). The residual is the diffuse receiver-type-inference tail every language eval carries (implicit conversions such as "" ~>, deeply generic receivers), not a systematic gap.
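
As a quick sanity check on the reported numbers, F1 is the harmonic mean of precision and recall. The helper below is illustrative only (it is not taken from the eval code) and simply plugs in the precision and recall values quoted above.

```python
def f1(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


before = f1(1.0, 0.40)  # precision 1.0, recall before the fix
after = f1(1.0, 0.73)   # precision 1.0, recall after the fix
print(f"{before:.2f} {after:.2f}")
```

With precision fixed at 1.0, the rounded values agree with the F1 figures stated in the result (0.57 and 0.84).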

Testing

RED→GREEN in history: the eval + fixture test commit is red for the infix/field-expression assertions; the fix commit turns it green. Full non-integration suite (4121 passed) and integration suite (142 passed, serial) green; call/language-focused subset (1541 passed) confirms no cross-language regression.

vitali87 · 2026-07-01T22:02:33Z

@greptile review

gemini-code-assist

Code Review

This pull request adds a retrieval evaluation for Scala that compares cgr's Scala call graph against an independent oracle built with scalameta (run via scala-cli). It also fixes a bug in call_processor.py where Scala infix operator calls and nullary method calls were dropped, and adds corresponding tests, constants, logging, and documentation. The review feedback suggests improving error handling in evals/oracles/scala_oracle.py when parsing the subprocess's JSON output: catch JSONDecodeError and include stderr and stdout in the error message to make debugging easier.

gemini-code-assist · 2026-07-01T22:05:35Z

+    proc = subprocess.run(
+        [scala_cli, ec.SCALA_CLI_RUN, str(_SOURCE), ec.SCALA_CLI_ARG_SEP, str(target)],
+        cwd=str(_ORACLE_DIR),
+        capture_output=True,
+        text=True,
+        check=True,
+    )
+    payload: OraclePayload = json.loads(proc.stdout or "{}")


Since capture_output=True is used, any warnings or errors printed to stderr (or unexpected output on stdout) will be captured and not visible to the user if json.loads fails with a JSONDecodeError. To make debugging easier, consider catching json.JSONDecodeError and logging or raising an exception that includes proc.stderr and proc.stdout.

Fixed in cd32275: _run_scala_oracle_payload now catches json.JSONDecodeError and raises a RuntimeError that includes proc.stderr and proc.stdout (via SCALA_ORACLE_DECODE_FAILED), so a compile error or changed launcher banner is surfaced instead of a context-free decode failure.
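
The pattern discussed in this thread might look roughly like the sketch below. It is not the code from cd32275: `run_json_tool` and the message layout are hypothetical (the real message comes from SCALA_ORACLE_DECODE_FAILED, whose text is not shown here), and the "tool" is a small Python one-liner standing in for scala-cli.

```python
import json
import subprocess
import sys


def run_json_tool(cmd: list[str]) -> dict:
    """Run a command that should print JSON; surface its output if decoding fails."""
    proc = subprocess.run(cmd, capture_output=True, text=True, check=True)
    try:
        return json.loads(proc.stdout or "{}")
    except json.JSONDecodeError as exc:
        # Captured streams are otherwise invisible, so attach them to the error.
        raise RuntimeError(
            f"tool emitted non-JSON output: {exc}\n"
            f"--- stderr ---\n{proc.stderr}\n"
            f"--- stdout ---\n{proc.stdout}"
        ) from exc


# Simulate a launcher that prints a banner instead of JSON.
noisy = [
    sys.executable,
    "-c",
    "import sys; print('Compiling project...'); sys.stderr.write('warning: deprecated flag')",
]
try:
    run_json_tool(noisy)
except RuntimeError as err:
    message = str(err)
    print(message)
```

Chaining with `from exc` keeps the original decode position in the traceback while the message carries the captured streams.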

greptile-apps · 2026-07-01T22:07:11Z

Greptile Summary

This PR adds a Scala retrieval eval and updates Scala call extraction. The main changes are:

- New scalameta-based Scala oracle run through scala-cli.
- New evals.scala_retrieval CLI for scoring cgr Scala CALLS edges.
- Scala infix operator call names extracted from the tree-sitter operator field.
- Bare paren-less selects excluded from both cgr and oracle grading.
- Fixture tests and docs for the Scala retrieval workflow.

Confidence Score: 4/5

The Scala retrieval evaluation and call extraction changes are narrowly scoped, with attention needed only for comment-policy compliance in the new oracle file.

The functional changes are covered by the added Scala retrieval fixture tests and described broader test runs; the remaining concern is a project-style hook violation in explanatory inline comments.

evals/oracles/scala_oracle/Oracle.scala

T-Rex Logs

What T-Rex did

Compared the scala-eval before and after logs to verify changes, noting that the after state includes a head commit, absence of scala-cli, and a stubbed execution proof for score and diff outputs.
Reviewed the scala-calls before and after logs, observing that the infix_expression 't ~~> T.make()' now yields '~~>' while the field_expression 'u.done' remains None.

Ran code and verified through T-Rex

Comments Outside Diff (1)

General comment

Scala bare nullary field_expression calls are still dropped
- Bug
  - The PR contract says Scala-gated handling should make paren-less nullary field_expression calls expose the field, e.g. u.done should produce a done target and therefore a user-visible CALLS edge when done is a declared first-party method. Executed head behavior still returns None for the parsed field_expression 'u.done', so the call target path continues to drop that edge. Infix operator handling is fixed, but the field-expression half of the described fix is not delivered.
- Cause
  - codebase_rag/parsers/call_processor.py around the Scala-gated infix case (lines 847-859 on head) explicitly handles only TS_SCALA_INFIX_EXPRESSION and comments that bare field_expression is deliberately not named, leaving Scala nullary selects to fall through to None.
- Fix
  - Add the promised Scala-gated field_expression target extraction for the appropriate nullary-call scenario, returning the field child text (e.g. done) without affecting other languages; then add/adjust a regression test that exercises u.done and verifies the corresponding CALLS edge or target is present.
Ran code and verified through T-Rex

Reviews (3): Last reviewed commit: "refactor(scala): scope call retrieval to..."

greptile-apps · 2026-07-01T22:08:18Z

Greptile Summary

This PR adds Scala retrieval evaluation and updates Scala call extraction. The main changes are:

- New scalameta oracle and scala-cli wrapper for Scala call-edge evaluation.
- New evals.scala_retrieval driver for scoring cgr Scala CALLS edges against oracle calls.
- Scala call target extraction for infix expressions and paren-less field expressions.
- Fixture tests and README coverage for the new Scala retrieval eval.

Confidence Score: 4/5

The Scala eval additions are mostly contained, but the call extraction change can create incorrect graph edges for ordinary field reads.

The review is focused on the changed parser and eval paths, and the reported behavior was confirmed with a targeted Scala indexing case.

codebase_rag/parsers/call_processor.py

T-Rex Logs

What T-Rex did

Reproduced the focused Scala field-read over-resolution in the repro harness using Holder.done as a val, Other.done as a first-party def, and Use.readField reading only obj.done; the repro shows an assertion failure where a CALLS edge was emitted for the non-method field read, with logs indicating a trie fallback to scala_field_over_resolve.Other.Other.done.
Blocked environment/setup for Scala eval path: base availability failed due to an environment/build dependency issue before any Scala eval could exist, and head availability reported scala-cli None with the import stack failing before oracle/scoring could run; a base missing-tool command also failed with a ModuleNotFoundError for defusedxml before Scala paths could be exercised.
Compared Scala infix-call behavior across runs: the base run had no CGR_CALL_EDGES and HAS_INFIX_EDGE was False, while the head run showed a CGR_CALL_EDGES entry for ('T.scala', '~~>') and HAS_INFIX_EDGE True, with the resolver logging '~~> -> scala_infix_probe.T.T.~>'; the Python sanity run also reported Python_CALLS and HAS_HELPER_NAME True.

Ran code and verified through T-Rex

Comments Outside Diff (1)

General comment

Scala retrieval CLI can traceback before the clean missing-tool exit
- Bug
  - Running python -m evals.scala_retrieval on head with PATH set so scala-cli cannot be found does not reach the advertised clean Typer exit. The module import first executes evals/__init__.py, which imports the broader CGR graph stack and fails on missing transitive dependencies (defusedxml in the captured run), producing an unhandled traceback before main() checks scala_available(). This violates the claimed CLI/toolchain behavior for absent scala-cli.
- Cause
  - The CLI module is executed as part of the evals package, whose package initialization eagerly imports cgr_graph/codebase_rag dependencies before evals.scala_retrieval.main() can perform the explicit scala_available() check and raise typer.Exit.
- Fix
  - Avoid eager heavyweight imports on the python -m evals.scala_retrieval path before the missing-scala-cli guard. For example, make evals/__init__.py lazy/minimal, or move CGR-dependent imports inside functions after the scala_available() check so the CLI can report the missing Scala toolchain cleanly.
Ran code and verified through T-Rex
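
The second option in the suggested fix (deferring heavy imports until after the toolchain guard) could be sketched like this. The names here are assumptions for illustration: `scala_available` mirrors the function mentioned in the review but is re-implemented with `shutil.which`, `main` returns a bool instead of raising typer.Exit, and `heavy_module` stands in for the cgr graph stack. Deferral inside the module only helps if the package's `__init__.py` does not itself import the heavy stack, since Python runs it first.

```python
import importlib
import shutil


def scala_available(tool: str = "scala-cli") -> bool:
    """True when the given executable can be found on PATH."""
    return shutil.which(tool) is not None


def main(tool: str = "scala-cli", heavy_module: str = "json") -> bool:
    """Return True if the eval would run, False if it was skipped cleanly."""
    if not scala_available(tool):
        print(f"{tool} not found on PATH; skipping Scala retrieval eval.")
        return False
    # Heavy dependencies are imported only after the guard has passed.
    importlib.import_module(heavy_module)
    return True


# Both the tool and the "heavy" module are deliberately nonexistent:
# the guard returns before the import is ever attempted.
ran = main(tool="no-such-scala-cli-binary", heavy_module="no_such_heavy_module")
print(ran)
```

Because the guard runs first, a missing transitive dependency cannot mask the missing-tool message.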

Prompt To Fix All With AI

Fix the following 1 code review issue. Work through them one at a time, proposing concise fixes.

---

### Issue 1 of 1
codebase_rag/parsers/call_processor.py:859-862
**Field accesses over-resolve**
Returning the bare `field` name for every Scala `field_expression` makes non-call field reads resolve through `resolve_function_call`'s simple-name fallback. When `obj.done` is a `val` or field and any first-party `def done` exists, this emits a false `CALLS` edge even though no method was invoked.

Reviews (2): Last reviewed commit: "fix(scala): name infix-operator and pare..."

vitali87 · 2026-07-01T22:17:54Z

@greptile review

vitali87 · 2026-07-01T22:35:41Z

On the remaining 'bare field_expression calls still dropped' point: this is intentional, and it directly implements the earlier P1 review feedback. Scala's uniform access principle makes a nullary method call (obj.done) and a plain field read (obj.done where done is a val) syntactically identical, so naming a bare field_expression/Term.Select as a call would turn a same-named field read into a spurious CALLS edge, which is exactly the over-resolution the first review flagged as P1 on both the oracle and cgr sides. The two cannot be told apart without receiver-type inference, so both cgr's extractor and the oracle deliberately grade application and infix sites only. Empirically this costs no recall (bare-select resolution added zero edges on scala-parser-combinators and scopt; recall 0.73 → 0.77, precision 1.0). test_scala_retrieval_eval.py pins the decision with assert ('Use.scala', 'done') not in edges. Delivering the field_expression edge and avoiding the field-read false positive are mutually exclusive here; this PR chooses precision, consistent with the P1.
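
To make the trade-off concrete, here is a toy model of the over-resolution, loosely following the repro from the earlier review (Holder.done as a val, Other.done as a def, Use.scala reading only obj.done). `resolve_by_simple_name` and `edges_for` are hypothetical helpers, not cgr's resolve_function_call; they only imitate a simple-name fallback.

```python
from typing import Optional

# Hypothetical first-party universe: Holder.done is a val, so only Other.done is a def.
DECLARED_DEFS = {"Other.done", "Use.readField"}


def resolve_by_simple_name(name: str, declared_defs: set[str]) -> Optional[str]:
    """Simple-name fallback: pick a declared def whose last segment matches."""
    matches = sorted(d for d in declared_defs if d.rsplit(".", 1)[-1] == name)
    return matches[0] if matches else None


def edges_for(file: str, site_names: list[str]) -> set[tuple[str, str]]:
    """Emit (caller_file, callee_simple_name) for every named site that resolves."""
    return {(file, n) for n in site_names if resolve_by_simple_name(n, DECLARED_DEFS)}


# Use.scala contains only `obj.done`, a read of the val Holder.done.
naming_bare_selects = edges_for("Use.scala", ["done"])  # select treated as a call site
skipping_bare_selects = edges_for("Use.scala", [])      # select not named at all

print(naming_bare_selects)
print(skipping_bare_selects)
```

Naming the select yields an edge to a method that was never invoked, while skipping it yields no edge, which is the behaviour the pinned assertion checks.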

vitali87 added 2 commits July 1, 2026 23:01

test(evals): add Scala CALLS-vs-scalameta retrieval eval, red for inf…

27769c1

…ix and field-expression calls

fix(scala): name infix-operator and paren-less method calls in the ca…

30f1d30

…ll target extractor

github-project-automation Bot added this to @vitali87's graph code Jul 1, 2026

gemini-code-assist Bot reviewed Jul 1, 2026

greptile-apps Bot reviewed Jul 1, 2026

Comment thread evals/oracles/scala_oracle/Oracle.scala Outdated

greptile-apps Bot reviewed Jul 1, 2026

Comment thread codebase_rag/parsers/call_processor.py Outdated

vitali87 added 2 commits July 1, 2026 23:11

fix(evals): raise with scala-cli stderr context when the Scala oracle…

cd32275

… emits non-JSON

refactor(scala): scope call retrieval to application and infix sites,…

5320217

… not ambiguous bare selects

vitali87 merged commit 0b4fab5 into main Jul 1, 2026
2 of 15 checks passed

github-project-automation Bot moved this to Done in @vitali87's graph code Jul 1, 2026

vitali87 deleted the eval/scala-retrieval branch July 1, 2026 22:44

vitali87 mentioned this pull request Jul 1, 2026

fix(cpp): resolve typedef and using-aliased receiver types to the underlying class #568

Merged

vitali87 merged 4 commits into main from eval/scala-retrieval
