Damus
G Force G · 2w
Can you name a few? I'm interested; I'm testing out my GPU.
BitcoinconManolito
You can try a mid-size Qwen 3.5 28B, for instance, but it will be useless unless you have over 50 GB of VRAM, let alone the larger 80B models.
I'm sure LLM technology will improve over time, using techniques such as MoE (mixture of experts) that reduce compute requirements without compromising performance.
But it'll take time to get Claude 4.6 performance on an affordable local server.
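A back-of-envelope check of the VRAM figure above. This is a rough sketch, not a precise sizing tool: the ~20% overhead factor for KV cache and activations, and the quantization widths, are my assumptions, not the commenter's.

```python
def vram_gb(params_billion: float, bits_per_weight: int, overhead: float = 0.2) -> float:
    """Rough GB of memory needed to hold model weights plus runtime overhead."""
    weight_gb = params_billion * 1e9 * (bits_per_weight / 8) / 1e9
    return weight_gb * (1 + overhead)

# A 28B model in FP16: ~67 GB, consistent with "over 50 GB of VRAM".
print(round(vram_gb(28, 16)))  # -> 67
# The same model quantized to 4-bit: ~17 GB, fits on a single 24 GB card.
print(round(vram_gb(28, 4)))   # -> 17
```

Quantization is why mid-size models are usable locally at all; the FP16 numbers are what make them "useless" on a single consumer GPU.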

ethfi · 2w
Nothing to it
Alan Siefert · 1w
I've seen evidence that unified-memory computers like the Apple Mac Studio and AMD Strix Halo can be clustered with RDMA over Ethernet to achieve usable tokens per second, around 15 t/s with fairly large models. Of course, these clusters are not cheap for most consumers, especially at today's hardware pric...