Skip to content

Conversation

@ngxson
Copy link
Collaborator

@ngxson ngxson commented Nov 29, 2025

This may silently fail if mul_mat return something other than F32 in the future, so I'm using ggml_row_size to be more future-proof.

@ngxson ngxson requested review from CISC and ggerganov November 29, 2025 15:19
@CISC
Copy link
Collaborator

CISC commented Nov 29, 2025

Did I miss something? I don't think we'll ever expect it to return quantized tensors, so this is surely overkill? If expecting anything other than F32 (in which case, we need to fix up a lot of places) using d_head*ggml_element_size(cur) should be enough?

@ngxson
Copy link
Collaborator Author

ngxson commented Nov 29, 2025

@CISC honestly I don't know if hard-coding sizeof(float) is a good practice in GGML overall, even when the tensor is guaranteed to be F32 (who knows, maybe a novel backend will store the tensor using just 31 bits instead of full 32 bits? hmm well bad example, in this case other things will break before this 😄 )

ggml_element_size(cur) * N is also OK, indeed I don't know what's the recommendation.

Not sure if you already came across this pattern somewhere else @ggerganov , maybe in whisper.cpp or somewhere else. What's your recommendation here?

Copy link
Member

@ggerganov ggerganov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using ggml_row_size seems like the most future-proof way. I agree that explicit sizeof(float) should be avoided, even if it does not matter today.

@ngxson ngxson merged commit 7f8ef50 into ggml-org:master Nov 30, 2025
73 of 74 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants