Show HN: HIGGS – new SOTA data-free LLM quantization

3 points by om8 2 months ago | 0 comments
My colleagues and I wrote a paper on HIGGS and integrated the method into Hugging Face transformers.

It is both more accurate and faster than NF4.

We have published pre-compressed Hugging Face models for everyone to try: https://huggingface.co/collections/ISTA-DASLab/higgs-675308e...
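
To quantize a model on the fly instead of downloading a pre-compressed one, something like the following should work. This is a minimal sketch, not a definitive recipe: it assumes a recent transformers release that exposes `HiggsConfig`, a CUDA GPU, and whatever extra kernel packages the integration requires. `load_higgs_quantized` is a hypothetical helper name, and the model id is whatever checkpoint you choose.

```python
def load_higgs_quantized(model_id: str, bits: int = 4):
    """Load `model_id` and quantize it with HIGGS at load time.

    No calibration dataset is passed anywhere: the method is data-free.
    Assumption: the installed transformers version provides HiggsConfig.
    """
    # Imported lazily so that defining this helper does not itself
    # require transformers (or a GPU) to be present.
    from transformers import AutoModelForCausalLM, AutoTokenizer, HiggsConfig

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        quantization_config=HiggsConfig(bits=bits),  # target bit width
        device_map="auto",  # let transformers place weights on the GPU
    )
    return model, tokenizer
```

Calling `load_higgs_quantized("<your-model-id>")` returns the quantized model and its tokenizer, which can then be used with `model.generate(...)` as usual.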