librekitty
@librekitty
i finally got gemma4 to fit in VRAM, the KV cache is a brutal size 🤖🙀

quantization is such a blessing 🙏
unixmonks · 4d
gemma4 is pretty good, you enjoying?
eardiod · 3d
Did you use the compressed model from NVIDIA? So far I've only tried the original 4-bit version, but this new NVIDIA compression seems to be even better.