
Conversation

@morgolock
Contributor

- Updated the convolution reference to branch the epilogue (see the first sketch below):
  * TO = float: int32-to-float dequantization (acc * sA * sB + bias_f32)
  * TO != float: the usual quantize_down_scale_by_fixedpoint with an int32 bias
- Changed the fixture to use an F32 bias tensor for Q->F32 runs (instead of S32), matching the arm_gemm dequant epilogue, which only supports a float bias.
- Added explicit template instantiations of convolution_layer with TBias=float, TO=float to fix linker errors in validation.
- Disabled activation in the arm_gemm dequant path: offsets are applied afterwards by CpuGemmLowpOffsetContributionKernel, so the activation must run there to see the correct final accumulator (see the second sketch below).
- In src/cpu/kernels/gemmlowp/generic/neon/impl.h, neon_run_offset_contribution_float() now computes the per-batch offset for vector_sum_col from the W stride instead of the Y stride.

This aligns the target and reference for quantized-to-F32 convolution tests and prevents premature clamping before the offset contributions are applied.
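
A minimal scalar sketch of the two reference epilogue branches. All names here (epilogue_f32, epilogue_q8, sA, sB, the multiplier/shift pair) are illustrative, not the reference's actual signatures; the fixed-point path is only a stand-in for what quantize_down_scale_by_fixedpoint does:

```cpp
#include <algorithm>
#include <cstdint>

// Scalar model of the branched reference epilogue (illustrative only).
// acc is the raw int32 GEMM accumulator; sA/sB are the source and weights
// quantization scales.

// TO = float: dequantize the accumulator and add a float bias.
float epilogue_f32(int32_t acc, float sA, float sB, float bias_f32)
{
    return static_cast<float>(acc) * sA * sB + bias_f32;
}

// TO != float: add the int32 bias, then requantize with a fixed-point
// multiplier/shift pair -- a sketch of the quantize_down_scale_by_fixedpoint
// behaviour (positive result shift only, for brevity).
int8_t epilogue_q8(int32_t acc, int32_t bias_s32, int32_t multiplier, int shift, int32_t out_offset)
{
    // Saturating rounding doubling high multiply (gemmlowp-style).
    int64_t v = (static_cast<int64_t>(acc + bias_s32) * multiplier + (int64_t{1} << 30)) >> 31;
    // Rounding right shift by the result shift.
    if (shift > 0)
    {
        v = (v + (int64_t{1} << (shift - 1))) >> shift;
    }
    v += out_offset;
    return static_cast<int8_t>(std::clamp<int64_t>(v, -128, 127));
}
```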

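A second sketch of why activation has to wait for the offset contribution. The function name, parameter layout, and stride comment are assumptions about the shape of the fix, not the kernel's real API; the contribution term is the standard expansion of (A - a_offset) * (B - b_offset):

```cpp
#include <algorithm>
#include <cstdint>

// Illustrative scalar model of the float offset-contribution epilogue.
// acc_f32 is the partially dequantized value coming out of arm_gemm
// (activation was deliberately left disabled there); sum_row_y/sum_col_x are
// the per-row LHS and per-column RHS sums; scale is sA * sB.
float offset_contribution_f32(float acc_f32, int32_t sum_row_y, int32_t sum_col_x,
                              int32_t a_offset, int32_t b_offset, int32_t depth,
                              float scale, float act_min, float act_max)
{
    // Fold in the offset terms of (A - a_offset) * (B - b_offset).
    const int32_t contrib = -a_offset * sum_col_x - b_offset * sum_row_y
                            + a_offset * b_offset * depth;
    const float v = acc_f32 + static_cast<float>(contrib) * scale;

    // Activation (clamping) must run here, on the final accumulator;
    // clamping inside arm_gemm would act on a value still missing `contrib`.
    return std::clamp(v, act_min, act_max);
}

// Per-batch addressing (conceptual): vector_sum_col holds one int32 per output
// column per batch, so its batch base must advance by the W stride; advancing
// by the Y stride picked up the wrong batch's column sums.
```
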
Change-Id: I6fffc98dc0798542a2702e6a593b850c16561e3b
Signed-off-by: Pablo Marquez Tello <[email protected]>

@morgolock requested a review from gunes-arm on September 19, 2025 10:56
@morgolock force-pushed the pr/conv_f32_dequant branch 4 times, most recently from 02cc3e6 to 4c780b3, on October 7, 2025 08:45
@morgolock force-pushed the pr/conv_f32_dequant branch 3 times, most recently from 6a8b64a to 88c1594, on October 7, 2025 17:22
@morgolock force-pushed the pr/conv_f32_dequant branch 3 times, most recently from 780b190 to 950a00b, on October 13, 2025 21:26
@morgolock force-pushed the pr/conv_f32_dequant branch from 950a00b to b8824c1 on October 16, 2025 09:07
@morgolock merged commit a977868 into main on October 16, 2025 (2 checks passed)
@morgolock deleted the pr/conv_f32_dequant branch on October 16, 2025 14:06