MachuPikacchu on nostr

Machu Pikacchu @MachuPikacchu 1776178893

I'm running the 4-bit quant from the ggml-org version (although I think I got similar performance from the unsloth one). Specifically, gemma4:26b-q4_k_m.

A good bit of the performance comes from the unified memory, because the Apple GPU itself is weak.

Here’s a screenshot of it doing OCR on its own model card, for example, with reasoning disabled. It finished in 1.9s at 78.1 t/s.
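For a rough sense of scale, those two numbers imply an output of about 150 tokens. This is only a back-of-the-envelope sketch: it assumes the 1.9s and the 78.1 t/s both describe the generation phase alone (no prompt processing or image encoding time), and `estimated_tokens` is just a hypothetical helper name.

```python
def estimated_tokens(duration_s: float, tokens_per_s: float) -> float:
    """Approximate tokens generated, assuming a steady rate over the whole run."""
    return duration_s * tokens_per_s


# Figures quoted in the post: 1.9 s at 78.1 tokens/s.
tokens = estimated_tokens(1.9, 78.1)
print(f"~{tokens:.0f} tokens")  # ~148 tokens
```

If the 1.9s also covers prompt or image processing, the real output was shorter than this estimate.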