
Conversation

@alvoron (Contributor) commented Oct 31, 2025

Details:

  • Before the fix, the ACL int8 convolution executor was chosen for the fp32 bias case. The previous type mapping forced an fp32 to int32 conversion, which led to accuracy degradation.
  • The type mapping has been fixed to accept an i32 bias only.
  • If the bias is not i32, the case is handled by the dnnl executor. To enable this, the order of ARM executors has been changed: the int8 executor is tried first, then the default dnnl executor (see the sketch below).
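
To make the selection logic concrete, here is a minimal C++ sketch of the idea described above, not the actual OpenVINO CPU plugin code: the names (`Precision`, `ConvConfig`, `supportsInt8Acl`, `pickExecutor`) are hypothetical, and the real executor registry is more involved.

```cpp
#include <functional>
#include <utility>
#include <vector>

// Hypothetical stand-ins for the plugin's real types.
enum class Precision { f32, i32, i8, u8 };

struct ConvConfig {
    Precision src;
    Precision wei;
    Precision bias;
    bool hasBias;
};

// The ACL int8 path now requires an i32 bias (or no bias at all). Before the
// fix, an f32 bias was accepted and force-converted to i32, losing precision.
bool supportsInt8Acl(const ConvConfig& cfg) {
    const bool int8Inputs =
        (cfg.src == Precision::i8 || cfg.src == Precision::u8) &&
        cfg.wei == Precision::i8;
    if (!int8Inputs)
        return false;
    return !cfg.hasBias || cfg.bias == Precision::i32;
}

// Executors are tried in order; the first one that supports the config wins.
// The fix puts the int8 executor first and the always-capable dnnl executor
// last, so unsupported bias types fall through to dnnl.
const char* pickExecutor(const ConvConfig& cfg) {
    using Entry = std::pair<const char*, std::function<bool(const ConvConfig&)>>;
    const std::vector<Entry> executors = {
        {"acl_int8", supportsInt8Acl},
        {"dnnl", [](const ConvConfig&) { return true; }},
    };
    for (const auto& [name, supports] : executors) {
        if (supports(cfg))
            return name;
    }
    return "none";
}
```

With this ordering, an int8 convolution carrying an f32 bias no longer matches the ACL path and falls through to dnnl instead of being silently converted.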

Tickets:

@alvoron requested review from a team as code owners October 31, 2025 12:24
@alvoron added the platform: arm (OpenVINO on ARM / ARM64) label Oct 31, 2025
@github-actions bot added the category: CPU (OpenVINO CPU plugin) label Oct 31, 2025
@EgorDuplensky (Contributor) commented:

Do we know why the bias is i32 in the first place?
I think this is because, from the ACL point of view, we are supposed to quantize the bias (apply the activation and weight scales to it).
The question is when to do it.
Ideally it should be done by an ov pass (replacing 'conv_int8+bias_f32' with 'conv_int8+bias_i32' by applying the scales to the bias const).
Then we would not need to fall back to the slower oneDNN implementation.
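
For reference, the usual int8 convention quantizes the bias with the combined activation and weight scales: bias_i32 = round(bias_f32 / (src_scale * wei_scale)). Below is a minimal sketch of that conversion, assuming per-output-channel weight scales; it is illustrative only, not the proposed ov pass.

```cpp
#include <cmath>
#include <cstdint>
#include <vector>

// Quantize an fp32 bias into the i32 domain expected by an int8 convolution:
// bias_i32[oc] = round(bias_f32[oc] / (src_scale * wei_scales[oc])).
// wei_scales is per output channel and must match bias_f32 in size.
std::vector<int32_t> quantize_bias(const std::vector<float>& bias_f32,
                                   float src_scale,
                                   const std::vector<float>& wei_scales) {
    std::vector<int32_t> bias_i32(bias_f32.size());
    for (size_t oc = 0; oc < bias_f32.size(); ++oc) {
        const float combined_scale = src_scale * wei_scales[oc];
        bias_i32[oc] =
            static_cast<int32_t>(std::lround(bias_f32[oc] / combined_scale));
    }
    return bias_i32;
}
```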

@alvoron requested a review from v-Golubev November 3, 2025 17:51