Conversation

@chraac (Contributor) commented Nov 28, 2025

Changes

Fix the RoPE op implementation. With these changes, the op is handled correctly and all ROPE backend tests pass.

Before

[ROPE] NMSE = 1.881769338 > 0.000000100   ROPE(type=f32,ne_a=[128,32,2,1],n_dims=128,mode=0,n_ctx=512,fs=1.424500,ef=0.746500,af=1.000000,ff=0,v=0,inplace=0): FAIL
[ROPE] NMSE = 1.916275620 > 0.000000100   ROPE(type=f32,ne_a=[128,32,2,1],n_dims=128,mode=0,n_ctx=512,fs=1.424500,ef=0.746500,af=1.000000,ff=1,v=0,inplace=0): FAIL
[ROPE] NMSE = 1.883398782 > 0.000000100   ROPE(type=f32,ne_a=[128,32,2,1],n_dims=128,mode=0,n_ctx=512,fs=1.424500,ef=0.746500,af=1.424500,ff=0,v=0,inplace=0): FAIL
[ROPE] NMSE = 1.873313624 > 0.000000100   ROPE(type=f32,ne_a=[128,32,2,1],n_dims=128,mode=0,n_ctx=512,fs=1.424500,ef=0.746500,af=1.424500,ff=1,v=0,inplace=0): FAIL
[ROPE] NMSE = 1.924327319 > 0.000000100   ROPE(type=f32,ne_a=[128,32,2,1],n_dims=128,mode=0,n_ctx=512,fs=1.424500,ef=0.746500,af=1.424500,ff=0,v=0,inplace=1): FAIL
[ROPE] NMSE = 1.974410582 > 0.000000100   ROPE(type=f32,ne_a=[128,32,2,1],n_dims=128,mode=0,n_ctx=512,fs=1.424500,ef=0.746500,af=1.424500,ff=1,v=0,inplace=1): FAIL

After

  ROPE(type=f32,ne_a=[128,32,2,1],n_dims=128,mode=0,n_ctx=512,fs=1.424500,ef=0.746500,af=1.000000,ff=0,v=0,inplace=0): OK
  ROPE(type=f32,ne_a=[128,32,2,1],n_dims=128,mode=0,n_ctx=512,fs=1.424500,ef=0.746500,af=1.000000,ff=1,v=0,inplace=0): OK
  ROPE(type=f32,ne_a=[128,32,2,1],n_dims=128,mode=0,n_ctx=512,fs=1.424500,ef=0.746500,af=1.424500,ff=0,v=0,inplace=0): OK
  ROPE(type=f32,ne_a=[128,32,2,1],n_dims=128,mode=0,n_ctx=512,fs=1.424500,ef=0.746500,af=1.424500,ff=1,v=0,inplace=0): OK
  ROPE(type=f32,ne_a=[128,32,2,1],n_dims=128,mode=0,n_ctx=512,fs=1.424500,ef=0.746500,af=1.424500,ff=0,v=0,inplace=1): OK
  ROPE(type=f32,ne_a=[128,32,2,1],n_dims=128,mode=0,n_ctx=512,fs=1.424500,ef=0.746500,af=1.424500,ff=1,v=0,inplace=1): OK

@github-actions bot added the `ggml` label (changes relating to the ggml tensor library for machine learning) on Nov 28, 2025
}

// TODO: use simd to speed up the remaining elements copy
memcpy(dst_data_loc, src_loc, (ne0 - rope_ctx->n_dims) * sizeof(float));
@chraac (Contributor, author):
QQ: do we get SIMD acceleration in this memcpy?

const uint64_t name##_end_cycles = HAP_perf_get_qtimer_count(); \
FARF(HIGH, __VA_ARGS__, (unsigned) HAP_perf_qtimer_count_to_us(name##_end_cycles - name##_start_cycles)); \
} while (0)

@chraac (Contributor, author):

This macro provides a convenient way to emit profiling logs from the NPU; the logging can be disabled entirely via a compiler flag.

}
if (ir > ir1) {
break;
}
@chraac (Contributor, author) commented Nov 30, 2025:

Those two inner `if` statements can be merged into the `for` loop's condition.
