Gigi · 1d
considering buying hardware to run everything locally. What should I buy? #asknostr
librekitty
intel is seriously competitive on price-to-VRAM, but i don't know how good the software support for local inference is

NVIDIA is usually the clear winner for performance; the 5xxx series (Blackwell) has support for NVFP4 quantized models
https://developer.nvidia.com/blog/introducing-nvfp4-for-efficient-and-accurate-low-precision-inference/
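rough napkin math on why 4-bit quantization matters for VRAM (very approximate: ignores the KV cache, activations, and NVFP4's scale-factor overhead):

# back-of-envelope: weight memory ~= params x bits / 8
def weight_gib(params_b, bits):
    return params_b * 1e9 * bits / 8 / 2**30

for bits, label in [(16, "FP16"), (8, "FP8"), (4, "NVFP4")]:
    print(f"70B @ {label}: ~{weight_gib(70, bits):.0f} GiB")
# 70B @ FP16: ~130 GiB, FP8: ~65 GiB, NVFP4: ~33 GiB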

but you could also do like, multiple 3090s or something

hope this helps
librekitty · 23h
you can also go the CPU route with tons of RAM, but inference speed will be terrible compared to GPU-accelerated setups
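quick sketch of why: single-stream token generation is basically memory-bandwidth bound, so tokens/sec tops out around bandwidth divided by bytes read per token (bandwidth figures below are ballpark):

# rough upper bound: every weight gets read once per generated token
def max_tok_per_sec(model_gb, bandwidth_gb_s):
    return bandwidth_gb_s / model_gb

model_gb = 35  # ~70B model quantized to 4-bit
print(f"dual-channel DDR5 (~90 GB/s): ~{max_tok_per_sec(model_gb, 90):.1f} tok/s")
print(f"RTX 3090 GDDR6X (~936 GB/s): ~{max_tok_per_sec(model_gb, 936):.0f} tok/s")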
zaytun · 13h
I think dual 3090s would be preferable to e.g. a DGX Spark with regard to inference speed, no? VRAM bandwidth is higher, I believe. The downside is that the model size limit is obviously lower with 48 GB of VRAM than with the 128 GB unified memory of the DGX Spark.
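same napkin math as above supports this, assuming a ~70B model at 4-bit and treating both bandwidth figures as approximate spec-sheet numbers (~936 GB/s per 3090, ~273 GB/s LPDDR5X on the Spark):

# memory-bound decode estimate: tok/s ~= effective bandwidth / GB read per token
model_gb = 35  # ~70B at 4-bit, fits either setup

# dual 3090s with tensor parallelism: each card streams half the weights in
# parallel, so effective bandwidth is roughly 2 x 936 GB/s (before overhead)
print(f"dual 3090: ~{2 * 936 / model_gb:.0f} tok/s, capped at 48 GB models")
print(f"DGX Spark: ~{273 / model_gb:.0f} tok/s, but fits models up to ~128 GB")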