Damus
fiatjaf
@fiatjaf
So, these language models, when they are being trained, do they need someone telling them what they got wrong and what they got right? How do they know?
UNCLE ROCKSTAR · 155w
The focus is not training... it's all about feeding more and more data to bigger and bigger clusters. This: https://void.cat/d/TaZLys7WMuwBWjAQHJK22n.webp
pam · 155w
I think if you feed more into it, its responses are different!
elidy · 155w
It may be a house of mirrors, seeking to tell you what you want to hear.
jericho · 155w
They don’t learn well though. I repeatedly told it the answer to a simple unit conversion problem, and it would profusely apologize, then proceed to give me an incorrect answer again.
Leo Fernevak · 155w
The language models obviously have some training data that hardcodes, or approximately hardcodes, political narratives. How that hardcoding is done isn't as important as who is able to hardcode it. Who pulls the strings? Judging from the answers it gives: governments and government alignment narrativ...
Steveidk · 155w
They're probably scoring it based on how well it can regurgitate information, so right or wrong is decided by the source information itself.
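(A minimal sketch of the idea Steveidk is describing, assuming the usual pre-training setup: the model is graded by cross-entropy on predicting the next token of the training text, so "right" just means matching what the source data actually said next. The tiny shapes and random tensors below are toy stand-ins, not any real model's configuration.)

```python
import torch
import torch.nn.functional as F

vocab_size, seq_len, batch = 100, 12, 4

# Stand-in for a language model's output: one logit vector per token position.
logits = torch.randn(batch, seq_len, vocab_size, requires_grad=True)

# The "answer key" is simply the training text itself, shifted by one token.
targets = torch.randint(0, vocab_size, (batch, seq_len))

# Cross-entropy is low when the model assigns high probability to the token
# that actually came next in the source data; no human grader is involved.
loss = F.cross_entropy(logits.reshape(-1, vocab_size), targets.reshape(-1))
loss.backward()
```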
Max Nam-Storm · 155w
They have no concept of correctness or reason. It's pattern matching that tricks our System 2 into its own pattern match of reason.
pam · 155w
According to Dave, someone tells him and corrects him, aka fine-tunes his algorithm: note1xsqaqwat978sc8vqxenc8ff33alkd25e03x0njv759jwvszpgsfqrt3xpz
Leonardo Dias · 154w
They count on human feedback for this kind of thing. ChatGPT has a simple feedback system for each answer: it asks you whether one answer was better than another. This reinforcement feedback yields better outputs with each iteration.
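(A minimal sketch of how pairwise "this answer was better" feedback like Leonardo describes can be turned into a reward model, in the style of RLHF. This is not ChatGPT's actual code; the tiny model, random embeddings, and all names here are hypothetical toy stand-ins.)

```python
import torch
import torch.nn as nn

class TinyRewardModel(nn.Module):
    """Maps a (toy) fixed-size answer embedding to a scalar reward."""
    def __init__(self, embed_dim: int = 16):
        super().__init__()
        self.score = nn.Linear(embed_dim, 1)

    def forward(self, embedding: torch.Tensor) -> torch.Tensor:
        return self.score(embedding).squeeze(-1)

def preference_loss(reward_chosen: torch.Tensor, reward_rejected: torch.Tensor) -> torch.Tensor:
    # Bradley-Terry style objective: push the preferred answer's reward
    # above the rejected answer's reward.
    return -torch.log(torch.sigmoid(reward_chosen - reward_rejected)).mean()

model = TinyRewardModel()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

chosen = torch.randn(8, 16)    # stand-in embeddings of answers users preferred
rejected = torch.randn(8, 16)  # stand-in embeddings of answers users rejected

loss = preference_loss(model(chosen), model(rejected))
optimizer.zero_grad()
loss.backward()
optimizer.step()
```

(The trained reward model is then typically used to steer the language model itself with a reinforcement learning step, which is where the "better outputs at each iteration" comes from.)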