ISSUE/346: Support cuBLASLt; implement LT linear, FP8 linear, FP8 block-wise linear, and FP8 group-wise quant #510
base: main
Some of the code still needs to be formatted.
abb321d to 104c424
…cuBLASLt and fp8 group-wise quant
| INFINI_STATUS_BAD_TENSOR_SHAPE = 11, | ||
| INFINI_STATUS_BAD_TENSOR_STRIDES = 12, | ||
| INFINI_STATUS_INSUFFICIENT_WORKSPACE = 13, | ||
| INFINI_STATUS_NOT_ALIGNED = 14, |
A string for this status needs to be added in InfiniCore/src/utils/infini_status_string.h.
| INFINI_DTYPE_C64 = 17, | ||
| INFINI_DTYPE_C128 = 18, | ||
| INFINI_DTYPE_BF16 = 19, | ||
| INFINI_DTYPE_F8_E4M3 = 20, |
This will affect the Python frontend interface; please go over it with @voltjia.
That's fine. I will wrap it once more in the infinicore Python layer, so it only needs to match up with the corresponding dtype in torch.
| __C __export infiniStatus_t | ||
| infiniopDestroyLinearDescriptor(infiniopLinearDescriptor_t desc); | ||
|
|
||
| #endif No newline at end of file |
Add a newline at the end of the file.
| from numpy.lib.stride_tricks import as_strided | ||
|
|
||
| from .. import InfiniopTestWriter, InfiniopTestCase, np_dtype_to_ggml, gguf_strides, contiguous_gguf_strides, process_zero_stride_tensor | ||
| from .. import ( |
Please revert the changes to these unrelated files.
| @@ -0,0 +1,406 @@ | |||
| import torch | |||
If the tests are all split into two parts, wouldn't it be better to split this into two operators?
|
|
||
| typedef struct InfiniopDescriptor *infiniopQuantizeDescriptor_t; | ||
|
|
||
| __C __export infiniStatus_t infiniopCreateQuantizeDescriptor( |
This is not a general-purpose quantize; the name should state which kind of quantize it is.
| #include <utility> | ||
|
|
||
| /** | ||
| * @brief Define the process for initializing a Descriptor of an elementwise operation |
Please do not make formatting changes in unrelated files.
Please write an interface design document; the meaning of some of the parameters needs to be explained.
A100 only supports F16 and higher precision





H100 supports FP8 E4M3 and FP8 E5M2
FP8-Block-Wise
FP8
Other
FP8-Quantize-Group-Wise
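
For readers unfamiliar with the term, group-wise FP8 quantization can be sketched in plain Python as below. This is only an illustration, not the kernel from this PR: it assumes one scale per group of contiguous elements, the scale chosen as `amax / 448` (448 being the largest finite E4M3 value), and round-to-nearest-even onto the E4M3 grid. The names `round_to_e4m3` and `quantize_group_wise` are hypothetical and do not appear in the PR.

```python
import math

E4M3_MAX = 448.0  # largest finite value of the FP8 E4M3 format


def round_to_e4m3(x: float) -> float:
    """Round x to the nearest E4M3-representable value, saturating at +/-448."""
    if x == 0.0:
        return 0.0
    a = min(abs(x), E4M3_MAX)
    # a = m * 2**e with 0.5 <= m < 1, so the binade of a starts at 2**(e - 1).
    _, e = math.frexp(a)
    # 3 mantissa bits give a spacing of 2**(e - 4); below 2**-6 the values
    # are subnormal and the spacing is fixed at 2**-9.
    step = 2.0 ** max(e - 4, -9)
    # Python's round() rounds halves to even, matching round-to-nearest-even.
    q = min(round(a / step) * step, E4M3_MAX)
    return math.copysign(q, x)


def quantize_group_wise(values, group_size):
    """Quantize a flat list in groups; returns (quantized values, per-group scales).

    Assumption for this sketch: groups are contiguous and len(values) is a
    multiple of group_size.
    """
    if group_size <= 0 or len(values) % group_size != 0:
        raise ValueError("len(values) must be a positive multiple of group_size")
    quantized, scales = [], []
    for start in range(0, len(values), group_size):
        group = values[start:start + group_size]
        amax = max(abs(v) for v in group)
        # An all-zero group gets scale 1.0 to avoid division by zero.
        scale = amax / E4M3_MAX if amax > 0.0 else 1.0
        scales.append(scale)
        quantized.extend(round_to_e4m3(v / scale) for v in group)
    return quantized, scales


def dequantize_group_wise(quantized, scales, group_size):
    """Inverse mapping: multiply each element by the scale of its group."""
    return [q * scales[i // group_size] for i, q in enumerate(quantized)]
```

Calling `quantize_group_wise(x, group_size)` returns the quantized values together with one scale per group; multiplying each group back by its scale (`dequantize_group_wise`) recovers an approximation of `x`. A real implementation would store the quantized values as 8-bit codes rather than Python floats.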