Damus
Ivan
@Ivan
Good morning, Nostr. Who's running local LLMs? What are the best models you can run at home for coding on a beefy PC? In 2026 I want to dig into local LLMs more and stop leaning on Claude and Gemini as much. I know I can use Maple for more private AI, but I prefer running my own model, and I like that there are no restrictions on models run locally. I know hardware is the bottleneck here; hopefully these things keep getting more efficient.
Jay · 7w
nostr:nprofile1qqsvyxc6dndjglxtmyudevttzkj05wpdqrla0vfdtja669e2pn2dzuqpzamhxue69uhhyetvv9ujumn0wd68ytnzv9hxgtcpzamhxue69uhhyetvv9ujuurjd9kkzmpwdejhgtcppemhxue69uhkummn9ekx7mp0kv6trc
o · 7w
I use GPT4All with a Mistral OpenOrca model. I prompt it with text posts and ideas to see whether the base cultural context for cross-domain ideas matches what I'm trying to say. It works fairly well; sometimes it even makes valid corrections I hadn't thought of. I don't have a decent GP...
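For anyone who wants to try the same setup, here is a minimal sketch using the gpt4all Python package. The model filename is an assumption based on GPT4All's published catalog; check what your own install lists for Mistral OpenOrca.

```python
from gpt4all import GPT4All

# Downloads the model on first run if it isn't already in the local model folder.
# Filename is assumed from GPT4All's catalog; adjust to match your install.
model = GPT4All("mistral-7b-openorca.gguf2.Q4_0.gguf")

with model.chat_session():
    reply = model.generate(
        "Does this post read as a cross-domain analogy or a literal claim? "
        "Post: 'Proof of work is photosynthesis for money.'",
        max_tokens=256,
    )
    print(reply)
```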
OceanSlim · 7w
I don't think you can really run anything unless you have a card with a minimum of 16 GB of VRAM, and even then the model you can run would be maybe a quarter of Sonnet's performance. You need something like four 24 GB cards to get close.
Corey San Diego · 7w
As I understand it, you'll want to limit yourself to roughly 1B parameters per 1 GB of RAM.
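A rough sketch of the memory arithmetic behind the two replies above: bytes per parameter are set by the quantization level, plus some headroom for the KV cache and runtime overhead (the 20% figure here is an assumption, not a measurement).

```python
def estimate_memory_gb(params_billion: float, bits_per_weight: int, overhead: float = 0.20) -> float:
    """Very rough VRAM/RAM estimate: weights only, plus a fudge factor
    for KV cache, activations, and runtime overhead."""
    weight_gb = params_billion * bits_per_weight / 8  # 1B params at 8-bit ~= 1 GB
    return weight_gb * (1 + overhead)

# 7B model at 4-bit: ~4 GB -> fits comfortably on a 16 GB card.
print(round(estimate_memory_gb(7, 4), 1))
# 70B model at 4-bit: ~42 GB -> roughly two 24 GB cards.
print(round(estimate_memory_gb(70, 4), 1))
# 70B model at 8-bit: ~84 GB -> this is where "four 24 GB cards" comes from.
print(round(estimate_memory_gb(70, 8), 1))
```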
librekitty · 7w
Browse the catalog at https://ollama.com/models
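Once something from that catalog is pulled (e.g. `ollama pull <model>`), Ollama exposes a local HTTP API on port 11434. A minimal sketch, assuming a coding-focused tag such as `qwen2.5-coder:14b` (swap in whatever you actually pulled):

```python
import json
import urllib.request

# Assumes Ollama is running locally and the model has already been pulled, e.g.:
#   ollama pull qwen2.5-coder:14b
payload = {
    "model": "qwen2.5-coder:14b",  # assumed tag; use any model from ollama.com/models
    "prompt": "Write a Python function that deduplicates a list while preserving order.",
    "stream": False,
}

req = urllib.request.Request(
    "http://localhost:11434/api/generate",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req) as resp:
    print(json.loads(resp.read())["response"])
```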
Purp1eOne · 7w
Gave your question to Grok: https://grok.com/share/c2hhcmQtMw_c08aa1ff-8f1d-4a31-b680-225e816d73af Good morning! I'm all in on local LLMs too—privacy, no filters, and owning your setup is the way forward. By 2026, efficiency has improved a ton with better quantization (like AWQ-4bit) and MoE arc...
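On the quantization point: AWQ needs pre-quantized weights, so as an adjacent illustration here is a sketch of 4-bit loading with Hugging Face transformers and bitsandbytes NF4 (a different 4-bit scheme than AWQ). The model ID is just a placeholder; any causal LM repo you have access to works.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "mistralai/Mistral-7B-Instruct-v0.2"  # placeholder repo; substitute your own

# NF4 4-bit quantization: weights load at roughly 0.5 bytes per parameter,
# so a 7B model fits in well under 8 GB of VRAM.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map="auto",
)

inputs = tokenizer("Explain tail recursion in one paragraph.", return_tensors="pt").to(model.device)
print(tokenizer.decode(model.generate(**inputs, max_new_tokens=200)[0], skip_special_tokens=True))
```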