Hi,
Has anyone tried deploying a low-precision quantized network (int4, int5, etc.) on NVDLA?
If so, could you share the steps? Were you able to successfully generate the calibration table using TensorRT, and does the hardware support this level of quantization?
I would really appreciate any help in this direction.
Thanks!