Dark⚡J (@Dark⚡J)
Just spent the entire day getting NVIDIA's Nemotron Super 120B running locally on a DGX Spark. No cloud. No monthly bill. No data leaving the house.

It was a pain. Here's everything that tripped us up.

🧱 The wall you'll hit:
The DGX Spark is GB10 Blackwell (sm_121). Stock PyPI packages don't support it, so everything has to be built from source, and the exact versions matter.

✅ What actually works:
torch 2.11.0+cu130 via uv pip install (not regular pip)
Triton custom-built from commit 4caa0328 (sm_121 support, not in any release yet)
flashinfer + flashinfer-python — versions must match exactly
vLLM from source at commit 66a168a1
Env: TORCH_CUDA_ARCH_LIST=12.1a VLLM_USE_FLASHINFER_MXFP4_MOE=1
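Rough shape of the install sequence, reconstructed from the pins above. This is a sketch, not a verified recipe: the index URL follows the standard PyTorch wheel pattern, and the build commands for Triton and vLLM are assumptions based on each project's usual from-source workflow. Check each step against the repos before running.

```shell
# 1) torch 2.11.0+cu130 via uv, with the cu130 index pinned so nothing downgrades it
#    (index URL assumed from the standard PyTorch wheel-index pattern)
uv pip install torch==2.11.0+cu130 --index-url https://download.pytorch.org/whl/cu130

# 2) Triton built from the commit that adds sm_121 support (not in any release yet)
git clone https://github.com/triton-lang/triton.git
cd triton
git checkout 4caa0328
pip install -e .   # build invocation may differ; see Triton's README
cd ..

# 3) flashinfer + flashinfer-python -- pin both to the same version
uv pip install flashinfer flashinfer-python

# 4) vLLM from source at the pinned commit, with the Spark-specific env set
git clone https://github.com/vllm-project/vllm.git
cd vllm
git checkout 66a168a1
export TORCH_CUDA_ARCH_LIST=12.1a
export VLLM_USE_FLASHINFER_MXFP4_MOE=1
pip install -e . --no-build-isolation
```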
⚠️ Gotchas:
nvidia-smi shows N/A for memory. That's normal on the Spark's unified memory architecture
Regular pip will silently downgrade torch. Use uv and pin the cu130 index
ProtonVPN blocks Tailscale — pause it or configure split tunneling
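Once it all builds, a quick sanity check is worth running before launching vLLM. The expected version string comes from the pin above; the (12, 1) capability is my assumption from the sm_121 designation, so verify on your own unit.

```shell
# Confirm torch is the cu130 build and actually sees the GB10 GPU
python -c "import torch; print(torch.__version__)"                   # expect 2.11.0+cu130
python -c "import torch; print(torch.cuda.is_available())"           # expect True
python -c "import torch; print(torch.cuda.get_device_capability())"  # (12, 1) if sm_121 is detected
```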

If you're running an OpenClaw agent, install the skill and your agent already knows every exact command, pinned commit, and fix: npx clawhub@latest install dgx-spark-setup, or grab it at clawhub.com/skills/dgx-spark-setup

Total AI sovereignty. Welcome to the few. ⚡