Damus
iefan 🕊️ profile picture
iefan 🕊️
@iefan 🕊️
Tested Grok 3, its “deepSearch” & “think” mode with premium+.

First thought: they’re clearly manipulating the benchmarks. It’s not the best model by any stretch, especially in math & code. That much I can say for sure, so benchmarks in those areas don’t make much sense.

XAI’s deep search is nowhere close to OpenAI’s deep research. In fact, I don’t think they’re even in the same category. XAI’s deep search is more like Perplexity’s but slightly better.

That said, Grok 3 isn’t a bad model. It’s actually better than I expected—way better than Grok 2, but there’s nothing extraordinary about it. They also didn’t open-source it, so it doesn’t add anything meaningful to the ecosystem.

Bonus: If you’re someone who wants to play with cutting-edge AI models & features, like models with chain-of-thought reasoning, grounded web search, Gemini 2 Pro, & the latest SOTA models before launch, with 2M-token context, all for free for personal, uou can use:

https://aistudio.google.com/

42❤️5
jb55 · 62w
You can completely game them by training on the benchmarks using RLHF. Considering now that we know elon blatantly lies about things i wouldn’t be that surprised.
Rhett Reisman · 62w
Seems like everything Elon touches (other than maybe SpaceX) is deceitfully manipulated like this