Moon
· 2w
What options exist for one to self-host an AI model AND the associated compute for inference? What is required? What is the cost (generally / directionally)?
#AskNostr
It works fine for me on an NVIDIA RTX 4080. You need Ollama to download and host the models and provide an API for inference, then you need a UI to interact with it, which could be OpenClaw, Open WebUI, or your own vibe-coded thing. qwen3:14b works well for agentic stuff; gpt-oss:20b works a little better in some situations.
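A minimal sketch of the Ollama workflow described above, assuming Ollama is already installed and its server is running locally on the default port:

```shell
# Pull one of the models mentioned above from the Ollama registry
ollama pull qwen3:14b

# Chat with it interactively from the terminal
ollama run qwen3:14b

# Ollama also exposes a local HTTP API (default port 11434);
# this is the endpoint that UIs like Open WebUI connect to
curl http://localhost:11434/api/generate -d '{
  "model": "qwen3:14b",
  "prompt": "Why is the sky blue?",
  "stream": false
}'
```

The prompt here is just a placeholder; point any OpenAI-compatible or Ollama-aware frontend at that local API and it handles the requests for you.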