[phyai] Marlin Kernel & AWQ/GPTQ quantization support #20

## Summary

Add support for Marlin INT4 kernels and AWQ/GPTQ quantized linear layers in `phyai`.

## Motivation

Marlin provides efficient FP16xINT4 matrix multiplication kernels for LLM inference: activations stay in FP16 while weights are stored as 4-bit integers. Supporting Marlin with AWQ/GPTQ checkpoints would let `phyai` run common 4-bit quantized models with a smaller weight memory footprint and higher throughput than FP16 weights on supported NVIDIA GPUs.
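For reference, here is a minimal NumPy sketch of what an FP16xINT4 linear layer computes, assuming asymmetric group-wise quantization along the input dimension. Every function name, the two-nibbles-per-byte packing, and the `group_size=128` default are illustrative assumptions; they are not the Marlin, AWQ, or GPTQ on-disk layouts, and not an existing `phyai` API. A fused kernel would dequantize inside the matmul instead of materializing the full weight matrix, and would run in FP16 rather than the float32 used here.

```python
import numpy as np

def quantize_int4(w, group_size=128):
    """Group-wise asymmetric 4-bit quantization (hypothetical helper).
    w: (in_features, out_features) float32. Returns codes in [0, 15],
    plus one scale and one minimum per (group, output column)."""
    k, n = w.shape
    assert k % group_size == 0, "in_features must be a multiple of group_size"
    g = w.reshape(k // group_size, group_size, n)
    w_min = g.min(axis=1, keepdims=True)
    w_max = g.max(axis=1, keepdims=True)
    scale = (np.maximum(w_max - w_min, 1e-8) / 15.0).astype(np.float32)
    q = np.clip(np.round((g - w_min) / scale), 0, 15).astype(np.uint8)
    return q.reshape(k, n), scale[:, 0, :], w_min[:, 0, :]

def pack_int4(q):
    """Pack two 4-bit codes per byte along the input dimension.
    Simplified layout; real checkpoints use their own packing order."""
    return (q[0::2] | (q[1::2] << 4)).astype(np.uint8)

def unpack_int4(packed):
    """Inverse of pack_int4: recover one code per element."""
    k2, n = packed.shape
    q = np.empty((k2 * 2, n), dtype=np.uint8)
    q[0::2] = packed & 0x0F   # low nibble holds the even rows
    q[1::2] = packed >> 4     # high nibble holds the odd rows
    return q

def dequantize_int4(packed, scale, w_min, group_size=128):
    """Rebuild an approximate float weight matrix from packed codes."""
    q = unpack_int4(packed).astype(np.float32)
    s = np.repeat(scale, group_size, axis=0)   # broadcast per-group scale to rows
    z = np.repeat(w_min, group_size, axis=0)   # broadcast per-group minimum to rows
    return q * s + z

def int4_linear(x, packed, scale, w_min, group_size=128):
    """Reference forward pass: y = x @ dequant(W)."""
    return x @ dequantize_int4(packed, scale, w_min, group_size)
```

In this sketch the packed weights take half a byte per element plus the per-group scale and minimum, which is where the memory saving over FP16 weights comes from.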

## References

- https://github.com/IST-DASLab/marlin
