Damus
utxo the webmaster 🧑‍💻 · 1w
With concurrent requests I'm getting 200 t/s (tokens per second) on 4x RTX 3090, running a qwen3.6 MoE 4-bit quant via vLLM
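
For anyone comparing numbers: a figure quoted "with concurrent requests" is usually the aggregate across all in-flight requests, not what a single request sees. Here is a rough sketch of how the two differ. The `RequestResult` type, the function names, and the sample values are all made up for illustration; none of it comes from vLLM or from the run above.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class RequestResult:
    """Timing for one finished request (hypothetical record, not a vLLM type)."""
    start_s: float          # wall-clock time the request was sent
    end_s: float            # wall-clock time the last token arrived
    completion_tokens: int  # number of generated tokens


def aggregate_tokens_per_second(results: List[RequestResult]) -> float:
    """Total generated tokens divided by the wall-clock span of the whole batch."""
    if not results:
        return 0.0
    wall = max(r.end_s for r in results) - min(r.start_s for r in results)
    if wall <= 0:
        return 0.0
    return sum(r.completion_tokens for r in results) / wall


def mean_per_request_tokens_per_second(results: List[RequestResult]) -> float:
    """Average speed seen by a single request, ignoring zero-length requests."""
    rates = [
        r.completion_tokens / (r.end_s - r.start_s)
        for r in results
        if r.end_s > r.start_s
    ]
    return sum(rates) / len(rates) if rates else 0.0


if __name__ == "__main__":
    # Illustrative values only: 8 overlapping requests, 250 tokens each, 10 s each.
    sample = [RequestResult(0.0, 10.0, 250) for _ in range(8)]
    print(aggregate_tokens_per_second(sample))
    print(mean_per_request_tokens_per_second(sample))
```

With overlapping requests the aggregate figure scales with the number of requests in flight, so it is worth stating the concurrency level next to any t/s number.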