fix(native): support Metal custom V-cache SET_ROWS by lalalune · Pull Request #9303 · elizaOS/eliza

lalalune · 2026-06-24T08:18:13Z

Summary

Points the plugins/plugin-local-inference/native/llama.cpp submodule at commit 6e83e4b9b808bc21100c7846fcc1acd0a0fa674c, which adds Metal SET_ROWS and copy/dequant support for the manually selected custom V-cache tensor types tbq3_0, tbq4_0, and q4_polar.
Keeps the parent repo change small: the root commit updates the native llama.cpp submodule pointer and adds the human-verifiable evidence file at .github/issue-evidence/9258-metal-v-cache-set-rows.md.
Rebases the branch onto current origin/develop before final validation.

Root Cause

Manual custom V-cache selections could route cache updates through GGML_OP_SET_ROWS, but the Metal backend did not have destination kernels or dispatch wiring for TBQ3_0, TBQ4_0, or Q4_POLAR. With flash attention enabled, those custom cache tensors could also reach stock attention paths that need a backend-supported dequant/copy path first.
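The failure mode described above can be illustrated with a toy model of backend op dispatch. This is a sketch only, not the ggml Metal source: the op and type names come from this PR, while the function names, the table layout, and the list of "stock" destination types are assumptions made for illustration.

```python
# Toy model of backend op dispatch. NOT the ggml Metal implementation.
# Type and op names come from the PR text; everything else is hypothetical.
CUSTOM_V_TYPES = {"tbq3_0", "tbq4_0", "q4_polar"}

# Assumed stand-in for the destination types that already had SET_ROWS kernels.
STOCK_DST_TYPES = {"f32", "f16", "q8_0"}


def make_backend(set_rows_dst_types):
    """Return a supports_op(op, dst_type) predicate for a hypothetical backend."""
    def supports_op(op, dst_type):
        if op == "SET_ROWS":
            return dst_type in set_rows_dst_types
        return True
    return supports_op


def write_cache_rows(supports_op, dst_type):
    """Mimic a V-cache update: fail when no SET_ROWS kernel exists for dst_type."""
    if not supports_op("SET_ROWS", dst_type):
        raise RuntimeError(f"SET_ROWS: no kernel for destination type {dst_type}")
    return f"SET_ROWS -> {dst_type}"


# Before the fix the custom types are missing from the dispatch table;
# after the fix they are registered alongside the stock ones.
before_fix = make_backend(STOCK_DST_TYPES)
after_fix = make_backend(STOCK_DST_TYPES | CUSTOM_V_TYPES)
```

In this model, `write_cache_rows(before_fix, "tbq3_0")` raises, while the same call against `after_fix` succeeds, which mirrors the gap between selecting a cache type and having a backend kernel for it.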

Validation

Native macOS Metal:

cmake --build build-metal-9258 --target test-backend-ops llama-cli llama-completion llama-server -j 12 -> passed
xcrun -sdk macosx metal ... ggml-metal.metal ... -> passed, warnings only
test-backend-ops test -b MTL0 -o SET_ROWS -p "(tbq3_0|tbq4_0|q4_polar)" -> 12/12 passed
test-backend-ops test -b MTL0 -o CPY -p "(tbq3_0|tbq4_0|q4_polar)" -> 6/6 passed
Real GGUF llama-cli smoke runs with -fa on -ctv tbq3_0, tbq4_0, and q4_polar -> all generated tokens and exited 0
Real GGUF llama-completion smoke runs with the same three cache types -> all exited 0
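A smoke matrix like the one listed above could be scripted roughly as follows. The `-fa on` and `-ctv` flags are the ones quoted in this PR; the model path, prompt, and token count are placeholders, since the PR does not give the full command lines, and the helper names are hypothetical.

```python
import subprocess

# Cache types exercised in the PR's smoke runs.
CACHE_TYPES = ["tbq3_0", "tbq4_0", "q4_polar"]


def smoke_command(binary, model_path, cache_type, n_tokens=4, prompt="Hello"):
    """Build an argv list for one smoke run (placeholder prompt and token count)."""
    return [
        binary,
        "-m", model_path,      # placeholder: path to a real GGUF model
        "-fa", "on",           # flash attention, as in the PR
        "-ctv", cache_type,    # manual V-cache type override
        "-n", str(n_tokens),
        "-p", prompt,
    ]


def run_smoke(binary, model_path):
    """Run every cache type and return {cache_type: exit_code}."""
    results = {}
    for cache_type in CACHE_TYPES:
        proc = subprocess.run(smoke_command(binary, model_path, cache_type))
        results[cache_type] = proc.returncode
    return results
```

Calling `run_smoke("./llama-completion", "/path/to/model.gguf")` on a Metal build would be expected to return exit code 0 for all three types if the fix is in place; extra flags may be needed depending on how the binary handles interactive mode.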

Node/web HTTP path:

llama-server built and served the real GGUF model on 127.0.0.1:19058
/completion HTTP requests for tbq3_0, tbq4_0, and q4_polar each returned JSON with tokens_predicted: 4 and exited 0
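The HTTP check above can be approximated with a small client. This sketch assumes the server was started separately with the desired `-ctv` value, and that the request asked for 4 tokens via `n_predict`; the PR only reports the `tokens_predicted: 4` field in the response, so the request body and helper names here are assumptions.

```python
import json
import urllib.request


def completion_request(base_url, prompt, n_predict):
    """Build a POST request for the llama-server /completion endpoint."""
    body = json.dumps({"prompt": prompt, "n_predict": n_predict}).encode("utf-8")
    return urllib.request.Request(
        base_url.rstrip("/") + "/completion",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def predicted_tokens(response_text):
    """Extract tokens_predicted from a /completion JSON response body."""
    return json.loads(response_text)["tokens_predicted"]


def check_server(base_url="http://127.0.0.1:19058", n_predict=4):
    """Send one request and report whether the expected token count came back."""
    req = completion_request(base_url, "Hello", n_predict)
    with urllib.request.urlopen(req) as resp:
        return predicted_tokens(resp.read().decode("utf-8")) == n_predict
```

`check_server()` needs a running server, so it is only meaningful once per cache type against a freshly started llama-server instance.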

iOS / Apple-platform packaging and runtime:

xcrun -sdk iphoneos metal ... ggml-metal.metal ... -> passed, warnings only
xcrun -sdk iphonesimulator metal ... ggml-metal.metal ... -> passed, warnings only
ELIZA_MTP_FORCE_REBUILD=1 node packages/app-core/scripts/build-llama-cpp-mtp.mjs --target ios-arm64-metal -> passed
ELIZA_MTP_FORCE_REBUILD=1 node packages/app-core/scripts/build-llama-cpp-mtp.mjs --target ios-arm64-simulator-metal -> passed
node packages/app-core/scripts/ios-xcframework/build-xcframework.mjs --output /tmp/LlamaCpp-9258.xcframework --verify -> passed; device and simulator kernel/runtime symbol audits passed, slices ios/arm64 and ios-simulator/arm64
bun run --cwd packages/app build:ios:local:sim -> passed with ** BUILD SUCCEEDED **
Physical iPhone XCTest via run-physical-device-smoke.mjs -> passed on an iPhone 16 Pro Max; testLibElizaInferenceAbiV1CallsMatchHeader, testLlamaKernelAndVoiceSymbolsResolve, and testMetalDeviceIsAvailableOnPhysicalIos passed, optional benchmark skipped because no model was bundled

Repo/package gates:

bun run --cwd plugins/plugin-native-llama test -> 4 files passed, 35 tests passed
bun run --cwd plugins/plugin-local-inference test -> 201 files passed, 1 skipped; 2065 tests passed, 13 skipped
bun install -> passed
bun run verify -> passed, 509 successful, 509 total

Evidence:

.github/issue-evidence/9258-metal-v-cache-set-rows.md

coderabbitai · 2026-06-24T08:18:22Z

Important

Review skipped

Auto reviews are disabled on this repository. Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: cf1fffc0-e1f9-43b5-961d-6eae9ed2fd4e

github-actions · 2026-06-24T09:10:12Z

❌ PR title does not match the required pattern. Please use one of these formats:

'type: description' (e.g., 'feat: add new feature')
'type(scope): description' (e.g., 'chore(core): update dependencies')
Valid types: feat, fix, docs, style, refactor, perf, test, build, ci, chore, revert, release
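The title rule above can be approximated locally with a regular expression. The workflow's actual pattern is not shown in this PR, so the expression below is an assumption reconstructed from the two formats and the list of valid types in the bot message.

```python
import re

# Valid types as listed in the bot message.
VALID_TYPES = (
    "feat", "fix", "docs", "style", "refactor", "perf",
    "test", "build", "ci", "chore", "revert", "release",
)

# Assumed shape: type, optional (scope), colon, space, non-empty description.
TITLE_RE = re.compile(r"^(?:%s)(?:\([^()]+\))?: \S.*$" % "|".join(VALID_TYPES))


def title_is_valid(title):
    """Return True when the title matches the assumed conventional format."""
    return TITLE_RE.match(title) is not None
```

Under this approximation, the final title `fix(native): support Metal custom V-cache SET_ROWS` passes, while the earlier `[codex] fix Metal custom V-cache set rows` does not.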

greptile-apps

Your free trial has ended. If you'd like to continue receiving code reviews, you can add a payment method here.

claude · 2026-06-24T11:08:39Z

Claude encountered an error.

I'll analyze this and get back to you.

lalalune force-pushed the fix/metal-v-cache-set-rows-9258 branch from 446b92a to ff9cdd8 on June 24, 2026 09:46

lalalune marked this pull request as ready for review June 24, 2026 09:48

greptile-apps Bot reviewed Jun 24, 2026

lalalune mentioned this pull request Jun 24, 2026

Metal: custom V-cache (tbq3_0/tbq4_0/q4_polar) SET_ROWS aborts under manual --cache-type-v override #9258

Closed

github-actions Bot added the Docs, ci, and plugins labels Jun 24, 2026

lalalune changed the title from "[codex] fix Metal custom V-cache set rows" to "fix(native): support Metal custom V-cache SET_ROWS" Jun 24, 2026

lalalune force-pushed the fix/metal-v-cache-set-rows-9258 branch from ff9cdd8 to 1cf1fbe on June 24, 2026 09:57

greptile-apps Bot reviewed Jun 24, 2026

fix metal custom v-cache set rows

fc6180f

lalalune force-pushed the fix/metal-v-cache-set-rows-9258 branch from 1cf1fbe to fc6180f on June 24, 2026 10:06

greptile-apps Bot reviewed Jun 24, 2026

lalalune merged commit f567171 into develop Jun 24, 2026
27 checks passed

lalalune deleted the fix/metal-v-cache-set-rows-9258 branch June 24, 2026 10:09
