Packing Intelligence into Fewer Bits: Non-Linear Quantization in LLMs
A 70-billion-parameter LLM stored in 16-bit floats needs roughly 140 GB of memory, more than any single GPU can hold. Quantization shrinks the model by replacing those 16-bit floats with much smaller integer representations.
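The 140 GB figure follows directly from the parameter count and the bit width per weight. A quick back-of-the-envelope sketch (counting only weight storage, ignoring activations, KV cache, and runtime overhead; the helper name is illustrative):

```python
def weight_memory_gb(n_params: float, bits_per_weight: int) -> float:
    """Memory needed to store the weights alone, in decimal gigabytes."""
    bytes_total = n_params * bits_per_weight / 8
    return bytes_total / 1e9

print(weight_memory_gb(70e9, 16))  # 16-bit floats: 140.0 GB
print(weight_memory_gb(70e9, 4))   # 4-bit quantization: 35.0 GB
```

Dropping from 16 bits to 4 bits per weight cuts the footprint by 4x, which is what makes large models fit on commodity hardware.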