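Dev Log [Week 5]: Optimized the NPU driver and implemented concurrent multi-task execution on the NPU. Previously only a single core could be used, executing/submitting one task at a time. Now all 3 cores work together, improving inference speed #5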
Dirinkbottle
announced in
Announcements
Week 5 Development Log (2.22-2.28)
Work Summary
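- matmul_npu_3core_qkv function: submits the Q, K, and V matrix multiplications to 3 NPU cores in parallel; inference was verified successfully on the board.
- submit_ioctrl implements batch task distribution, so a single ioctl can submit tasks to multiple cores; the new wait_all_npucore adds a parallel wait mechanism.
Verification Results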
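The on-board run log shows that multi-core parallel execution succeeded:
- 3 tasks, 3 cores indicates that the Q, K, and V matmuls ran in parallel on three cores
- 1 tasks, 1 core indicates the subsequent wo matrix multiplication (single core)
Driver Multi-Core Submission Flow Refactoring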
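Batch Task Distribution
Refactored the submit_ioctrl function so that it automatically distributes the task array passed from user space across multiple NPU cores.
Parallel Wait Mechanism
Added the wait_all_npucore function, which waits on multiple cores in parallel.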
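The driver code for these two steps is not reproduced in this log, so the following is only a minimal user-space model of the idea, not the actual submit_ioctrl or wait_all_npucore implementation. Every name in it (npu_task, npu_sim, distribute_tasks, npu_sim_tick, wait_all_cores) is hypothetical. The round-robin assignment policy is an assumption, and a simulated "tick" stands in for the completion interrupts a real driver would wait on.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NPU_CORE_COUNT 3 /* assumption: three cores, matching the log */

/* Hypothetical task descriptor; the real driver's layout is not shown. */
struct npu_task {
    int id;   /* caller-chosen identifier */
    int core; /* filled in by distribute_tasks() */
};

/* Simulated hardware state (a stand-in for real core registers/IRQs). */
struct npu_sim {
    uint32_t pending_mask;         /* bit c set: core c still has work */
    int remaining[NPU_CORE_COUNT]; /* tasks queued on each core */
};

/* Assign n tasks round-robin over the first ncores cores (assumed policy).
 * Returns the number of cores that received at least one task. */
int distribute_tasks(struct npu_sim *npu, struct npu_task *tasks,
                     size_t n, int ncores)
{
    if (ncores < 1)
        ncores = 1;
    if (ncores > NPU_CORE_COUNT)
        ncores = NPU_CORE_COUNT;

    for (size_t i = 0; i < n; i++) {
        int core = (int)(i % (size_t)ncores);
        tasks[i].core = core;
        npu->remaining[core]++;
        npu->pending_mask |= 1u << core;
    }
    return n < (size_t)ncores ? (int)n : ncores;
}

/* One simulated time step: every busy core finishes one task. */
void npu_sim_tick(struct npu_sim *npu)
{
    for (int c = 0; c < NPU_CORE_COUNT; c++) {
        if (npu->remaining[c] > 0 && --npu->remaining[c] == 0)
            npu->pending_mask &= ~(1u << c);
    }
}

/* Wait until every core in wait_mask is idle.
 * Returns the number of ticks spent, or -1 if max_ticks is exceeded. */
int wait_all_cores(struct npu_sim *npu, uint32_t wait_mask, int max_ticks)
{
    int ticks = 0;
    while (npu->pending_mask & wait_mask) {
        if (ticks >= max_ticks)
            return -1; /* timeout: some core never became idle */
        npu_sim_tick(npu);
        ticks++;
    }
    return ticks;
}
```

In this model, distributing three tasks with ncores set to 3 and then calling wait_all_cores with mask 0x7 returns after one tick, while the same three tasks with ncores set to 1 need three ticks. These figures describe the toy model only and are not measurements from the board.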