Damus
fiatjaf (@fiatjaf)
So, these language models, when they are being trained, do they need someone telling them what they got wrong and what they got right? How do they know?
mark tyler · 151w
There are multiple steps. In the first training step they are trying to predict the next character in some text. Let’s say they got the first 10 characters of that last sentence. They should reply with an “m”. If they do, reward. The RLHF step does a similar thing, but instead of one character ...
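A minimal sketch of the training signal mark describes, using a toy bigram counter instead of a real neural network; the corpus, the counting model, and the `predict_next` helper are all illustrative assumptions (GPT-style models actually predict tokens, not single characters, and train with a cross-entropy loss rather than a literal per-character reward):

```python
from collections import Counter, defaultdict

# Toy corpus: in next-character prediction, the text itself supplies
# the "right answers", so no human needs to label anything.
text = ("There are multiple steps. In the first training step they are "
        "trying to predict the next character in some text.")

# "Training": count which character follows each character (a bigram model).
counts = defaultdict(Counter)
for prev, nxt in zip(text, text[1:]):
    counts[prev][nxt] += 1

def predict_next(prefix):
    """Guess the most frequent successor of the last character seen."""
    last = prefix[-1]
    if last not in counts:
        return " "  # fallback for unseen contexts
    return counts[last].most_common(1)[0][0]

# "Grading": the reward is simply whether the guess matches the character
# that actually comes next in the text.
prefix, target = text[:10], text[10]   # "There are " -> "m"
guess = predict_next(prefix)
print(f"prefix={prefix!r} target={target!r} guess={guess!r} "
      f"reward={int(guess == target)}")
```

A bigram over one character of context is far too weak to guess “m” here (it guesses “t”, the most common character after a space); real models earn the reward by learning to use much longer contexts.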
OriginalSize · 151w
That's the G in GPT. Generative models produce a lot of output, with feedback provided by a classifier that tells them whether what they made looks like what the classifier was trained on. OpenAI and others use humans to improve the guardrails, but it's a shoddy arms-race kind of process.
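A toy version of the generator-plus-classifier feedback loop described above, with every piece a labeled stand-in: the keyword-overlap `score` function plays the role of a real reward model (which would itself be a neural network trained on human preference labels), and the "generator" is just random word salad:

```python
import random

# Stand-in "classifier": scores a candidate by how much of it overlaps a
# reference corpus, i.e. whether it "looks like what the classifier was
# trained on". Purely illustrative; real reward models are trained networks.
reference_words = set("the quick brown fox jumps over a lazy dog".split())

def score(candidate):
    words = candidate.split()
    return sum(w in reference_words for w in words) / max(len(words), 1)

# Stand-in "generator": random word sequences. Keeping only the candidates
# the classifier likes is the feedback dynamic in its crudest form.
vocab = sorted(reference_words) + ["banana", "quantum", "sofa"]
random.seed(0)
candidates = [" ".join(random.choices(vocab, k=5)) for _ in range(20)]
for c in candidates:
    if score(c) >= 0.8:
        print(f"{score(c):.1f}  {c}")
```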
jack · 151w
@dave do you need us to tell you when you are wrong?
Leo Fernevak · 151w
The language models obviously have some training data that hardcodes, or approximately hardcodes, political narratives. How that hardcoding is done isn't as important as who is able to hardcode it. Who pulls the strings? Judging from the answers it gives: governments and government-alignment narrativ...
Steveidk · 151w
They’re probably scoring it based on how well it can regurgitate information, so right or wrong is decided by the source information itself.
Max Nam-Storm · 151w
They have no concept of correctness or reason. It’s pattern matching that tricks our System 2 into its own pattern-match of reasoning.
pam · 151w
According to Dave, someone tells him and corrects him, a.k.a. fine-tunes his algorithm note1xsqaqwat978sc8vqxenc8ff33alkd25e03x0njv759jwvszpgsfqrt3xpz
Leonardo Dias · 150w
They count on human feedback for this kind of thing. ChatGPT has a simple feedback system for each answer: it asks you whether one answer was better than the other. This reinforcement feedback yields better outputs at each iteration.
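Leonardo's "was this answer better than the other" comparison is exactly the signal used to train RLHF reward models. A sketch of the standard pairwise (Bradley-Terry) objective, with made-up scalar scores standing in for reward-model outputs:

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def preference_loss(r_chosen, r_rejected):
    # Small when the answer the human preferred already outscores the
    # rejected one; large when the model ranks the pair the wrong way.
    return -math.log(sigmoid(r_chosen - r_rejected))

# Made-up scores standing in for reward-model outputs on two answers.
print(preference_loss(2.0, 0.5))  # ranked correctly -> ~0.20
print(preference_loss(0.5, 2.0))  # ranked wrongly   -> ~1.70
```

Minimizing this loss pushes the reward model to score human-preferred answers higher; that reward model then steers the language model toward better outputs at each iteration.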