Show HN: HIGGS – new sota data-free LLM quantization
3 points by om8 2 months ago | 0 comments
My colleagues and I wrote a paper on this method and integrated it into transformers.
It is both more accurate and faster than NF4.
We have published compressed models on Hugging Face for everyone to try: https://huggingface.co/collections/ISTA-DASLab/higgs-675308e...