Damus
PayPerQ · 2w
Since we proxy to openrouter for this particular model, you can use any parameters that OR accepts and force 8 bit quantization if you like. We could be better about advertising this feature though. ...
redshift profile picture
Thank you for confirming :).
The default sorting algorithm is the same as the one you see in my screenshot. It goes to 4-bit quantization.

Yes, we can force 8 bit quantization, but it is more expensive. Do users want that? That's what I'm trying to figure out.
1
PayPerQ · 2w
I feel like 90%+ of folks don't even know the difference between 4 and 8 bit. Do customers "want" 8 bit without realizing they do? Unsure. Really no idea if the cost/quality tradeoff is worth it. The scientific community also seems to have little consensus on if it is something very impactful. Some ...