[phyai] Marlin Kernel & AWQ/GPTQ quantization support #20

@chenghuaWang

Description

Summary

Add support for Marlin INT4 kernels and AWQ/GPTQ quantized linear layers in phyai.

Motivation

Marlin provides efficient FP16×INT4 matrix multiplication kernels for LLM inference: activations stay in FP16 while weights are stored as 4-bit integers. Supporting Marlin with AWQ/GPTQ checkpoints would let phyai run common 4-bit quantized models with a smaller weight memory footprint and higher throughput than FP16 weights on supported NVIDIA GPUs.
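To make the request concrete, here is a minimal pure-Python sketch of what a group-wise 4-bit quantized linear layer computes. It is a reference for the arithmetic only and rests on assumptions: AWQ and GPTQ checkpoints typically store packed 4-bit codes with per-group scales and zero points, but the nibble order, the tiny `GROUP_SIZE`, and every function name below are hypothetical. This is not Marlin's packed weight layout and not an existing phyai API.

```python
from typing import List, Sequence, Tuple

GROUP_SIZE = 4  # hypothetical; kept tiny so the example is easy to follow

PackedRow = Tuple[bytes, List[float], List[int]]


def quantize_group(weights: Sequence[float]) -> Tuple[List[int], float, int]:
    """Asymmetric 4-bit quantization of one group: (codes, scale, zero)."""
    lo = min(min(weights), 0.0)  # make sure the range contains 0
    hi = max(max(weights), 0.0)
    scale = (hi - lo) / 15 or 1.0  # 15 = 2**4 - 1; avoid a zero scale
    zero = min(15, max(0, round(-lo / scale)))
    codes = [min(15, max(0, round(w / scale) + zero)) for w in weights]
    return codes, scale, zero


def pack_int4(codes: Sequence[int]) -> bytes:
    """Pack pairs of 4-bit codes into bytes, low nibble first (assumed order)."""
    assert len(codes) % 2 == 0
    return bytes(codes[i] | (codes[i + 1] << 4) for i in range(0, len(codes), 2))


def unpack_int4(packed: bytes) -> List[int]:
    """Inverse of pack_int4."""
    out: List[int] = []
    for b in packed:
        out.append(b & 0xF)
        out.append(b >> 4)
    return out


def quantize_row(weights: Sequence[float], group_size: int = GROUP_SIZE) -> PackedRow:
    """Quantize one weight row group by group and pack the codes."""
    codes: List[int] = []
    scales: List[float] = []
    zeros: List[int] = []
    for start in range(0, len(weights), group_size):
        c, s, z = quantize_group(weights[start:start + group_size])
        codes.extend(c)
        scales.append(s)
        zeros.append(z)
    return pack_int4(codes), scales, zeros


def dequantize_row(packed: bytes, scales: Sequence[float], zeros: Sequence[int],
                   group_size: int = GROUP_SIZE) -> List[float]:
    """Recover approximate weights: w ~ (code - zero) * scale, per group."""
    codes = unpack_int4(packed)
    return [(q - zeros[i // group_size]) * scales[i // group_size]
            for i, q in enumerate(codes)]


def quantized_linear(x: Sequence[float], rows: Sequence[PackedRow]) -> List[float]:
    """y[j] = sum_i x[i] * W[j][i], with each W[j] stored in packed 4-bit form."""
    return [sum(xi * wi for xi, wi in zip(x, dequantize_row(*row))) for row in rows]
```

A fused kernel such as Marlin would perform the dequantization and the multiply in one pass on the GPU instead of materializing the dequantized row; the sketch only shows the result such a kernel has to match.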

Metadata

Assignees: No one assigned
Labels: help wanted (Extra attention is needed)
Status: Todo
Milestone: No milestone
Relationships: None yet
Development: No branches or pull requests