Damus
Ivan
@Ivan
Good morning, Nostr. Who's running local LLMs? What are the best models you can run at home for coding on a beefy PC? In 2026 I want to dig into local LLMs more and stop leaning on Claude and Gemini as much. I know I can use Maple for more private AI, but I prefer running my own model, and I like that there are no restrictions on models run locally. I know hardware is the bottleneck here; hopefully these things keep getting more efficient.
Jay · 7w
nostr:nprofile1qqsvyxc6dndjglxtmyudevttzkj05wpdqrla0vfdtja669e2pn2dzuqpzamhxue69uhhyetvv9ujumn0wd68ytnzv9hxgtcpzamhxue69uhhyetvv9ujuurjd9kkzmpwdejhgtcppemhxue69uhkummn9ekx7mp0kv6trc
o · 7w
I use GPT4All with a Mistral OpenOrca model. I prompt it with text posts and ideas to see whether the base cultural context for cross-domain ideas matches what I'm trying to say. It works fairly well; sometimes it even makes valid corrections I hadn't thought of. I don't have a decent GP...
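For anyone who wants to try the same setup, here is a minimal sketch using the gpt4all Python package. The model filename is an assumption based on GPT4All's published catalog; check what your own install lists for Mistral OpenOrca.

```python
from gpt4all import GPT4All

# Downloads the model on first run if it isn't already in the local model folder.
# Filename is assumed from GPT4All's catalog; adjust to match your install.
model = GPT4All("mistral-7b-openorca.gguf2.Q4_0.gguf")

with model.chat_session():
    reply = model.generate(
        "Does this post read as a cross-domain analogy or a literal claim? "
        "Post: 'Proof of work is photosynthesis for money.'",
        max_tokens=256,
    )
    print(reply)
```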
OceanSlim · 7w
I don't think you can really run anything unless you have a card with a minimum of 16 GB of VRAM, and even then the model you can run would be maybe a quarter of Sonnet's performance. You need something like four 24 GB cards to get close.
Corey San Diego · 7w
As I understand it, you'll want to limit yourself to roughly 1B parameters per 1 GB of RAM.
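A rough sketch of the memory arithmetic behind the two replies above: bytes per parameter are set by the quantization level, plus some headroom for the KV cache and runtime overhead (the 20% figure here is an assumption, not a measurement).

```python
def estimate_memory_gb(params_billion: float, bits_per_weight: int, overhead: float = 0.20) -> float:
    """Very rough VRAM/RAM estimate: weights only, plus a fudge factor
    for KV cache, activations, and runtime overhead."""
    weight_gb = params_billion * bits_per_weight / 8  # 1B params at 8-bit ~= 1 GB
    return weight_gb * (1 + overhead)

# 7B model at 4-bit: ~4 GB -> fits comfortably on a 16 GB card.
print(round(estimate_memory_gb(7, 4), 1))
# 70B model at 4-bit: ~42 GB -> roughly two 24 GB cards.
print(round(estimate_memory_gb(70, 4), 1))
# 70B model at 8-bit: ~84 GB -> this is where "four 24 GB cards" comes from.
print(round(estimate_memory_gb(70, 8), 1))
```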
librekitty · 7w
Browse the catalog at https://ollama.com/models
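Once something from that catalog is pulled (e.g. `ollama pull <model>`), Ollama exposes a local HTTP API on port 11434. A minimal sketch, assuming a coding-focused tag such as `qwen2.5-coder:14b` (swap in whatever you actually pulled):

```python
import json
import urllib.request

# Assumes Ollama is running locally and the model has already been pulled, e.g.:
#   ollama pull qwen2.5-coder:14b
payload = {
    "model": "qwen2.5-coder:14b",  # assumed tag; use any model from ollama.com/models
    "prompt": "Write a Python function that deduplicates a list while preserving order.",
    "stream": False,
}

req = urllib.request.Request(
    "http://localhost:11434/api/generate",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req) as resp:
    print(json.loads(resp.read())["response"])
```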
Purp1eOne · 7w
Gave your question to Grok: https://grok.com/share/c2hhcmQtMw_c08aa1ff-8f1d-4a31-b680-225e816d73af Good morning! I'm all in on local LLMs too—privacy, no filters, and owning your setup is the way forward. By 2026, efficiency has improved a ton with better quantization (like AWQ-4bit) and MoE arc...
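On the quantization point: AWQ needs pre-quantized weights, so as an adjacent illustration here is a sketch of 4-bit loading with Hugging Face transformers and bitsandbytes NF4 (a different 4-bit scheme than AWQ). The model ID is just a placeholder; any causal LM repo you have access to works.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "mistralai/Mistral-7B-Instruct-v0.2"  # placeholder repo; substitute your own

# NF4 4-bit quantization: weights load at roughly 0.5 bytes per parameter,
# so a 7B model fits in well under 8 GB of VRAM.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map="auto",
)

inputs = tokenizer("Explain tail recursion in one paragraph.", return_tensors="pt").to(model.device)
print(tokenizer.decode(model.generate(**inputs, max_new_tokens=200)[0], skip_special_tokens=True))
```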