quic-dhirajku
Contributor

Updated the run_vlm_kv_model_on_pytorch and run_vlm_kv_model_on_ort methods to run with the latest dual QPC setup, along with the required changes to the Input Handler of VLMs.

Also updated how head_dim is calculated for past_key_value creation, since certain models now provide a specific head_dim in their config. We fall back to the previous method if the parameter isn't found in the config.
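
For illustration, the head_dim fallback described above could look roughly like the sketch below. This is not the code from this PR: the helper name `resolve_head_dim` is hypothetical, and the "previous method" is assumed here to be `hidden_size // num_attention_heads`, which is the common convention but is not confirmed by the description.

```python
from types import SimpleNamespace


def resolve_head_dim(config):
    """Return the per-head dimension used to size past_key_value tensors.

    Hypothetical helper: prefers an explicit ``head_dim`` on the config and
    otherwise derives it (assumed fallback: hidden_size // num_attention_heads).
    """
    head_dim = getattr(config, "head_dim", None)
    if head_dim is not None:
        return head_dim
    # Assumed "previous method": derive head_dim from the hidden size.
    return config.hidden_size // config.num_attention_heads


# Stand-in configs for illustration only; real model configs carry more fields.
explicit = SimpleNamespace(hidden_size=3072, num_attention_heads=16, head_dim=256)
derived = SimpleNamespace(hidden_size=4096, num_attention_heads=32)

print(resolve_head_dim(explicit))  # explicit head_dim wins
print(resolve_head_dim(derived))   # falls back to the derived value
```

Checking for `None` rather than truthiness keeps the explicit value authoritative whenever the attribute is set, and the derived value is only used when the config omits it.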

Signed-off-by: Dhiraj Kumar Sah <[email protected]>
@quic-hemagnih quic-hemagnih merged commit e905575 into quic:main Oct 8, 2025
4 checks passed