Damus
utxo the webmaster 🧑‍💻
@utxo the webmaster 🧑‍💻
a story about the limits of vibe coding:

Recently built a blackjack card counting calculator that helps advantage players know their expected value (EV) depending on game conditions, bet spreads, deck penetration and so on.

I had the simulation results from external software, but wanted to express this as a math formula so we could cover any conditions that didn't have simulation results.

so I built a little self-calibration tool, where the AI tweaks a few numbers, runs tests against the real simulator results, and loops until all tests pass a given threshold
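
a minimal sketch of what such a calibration loop could look like, for illustration only. The formula shape, the parameter names, and the reference numbers below are all made up (the post does not give the real ones); the only things taken from the story are the loop itself and the rule that the threshold and the simulator results stay fixed while the parameters move.

```python
import random
from types import MappingProxyType

# Hypothetical reference data: (bet_spread, deck_penetration) -> EV in % per hand.
# Invented numbers for illustration, NOT real simulator output.
# MappingProxyType gives a read-only view, so the loop cannot overwrite entries.
SIM_RESULTS = MappingProxyType({
    (8, 0.75): 1.10,
    (12, 0.75): 1.40,
    (8, 0.50): 0.60,
    (12, 0.50): 0.80,
})

# Maximum allowed absolute error. A constant: the loop only reads it.
THRESHOLD = 0.05


def ev_formula(params, spread, penetration):
    """Placeholder model (an assumption): EV = a*pen + b*spread*pen + c."""
    a, b, c = params
    return a * penetration + b * spread * penetration + c


def max_error(params):
    """Worst-case absolute gap between the formula and the reference data."""
    return max(
        abs(ev_formula(params, spread, pen) - ev)
        for (spread, pen), ev in SIM_RESULTS.items()
    )


def calibrate(start, steps=5000, seed=0):
    """Random hill climb: nudge the parameters, keep a nudge only if it helps."""
    rng = random.Random(seed)
    best, best_err = list(start), max_error(start)
    for _ in range(steps):
        if best_err <= THRESHOLD:
            break  # every reference point is within tolerance
        candidate = [value + rng.gauss(0, 0.05) for value in best]
        err = max_error(candidate)
        if err < best_err:
            best, best_err = candidate, err
    return best, best_err


if __name__ == "__main__":
    params, err = calibrate([0.0, 0.0, 0.0])
    print("params:", params)
    print("max error:", err, "passed:", err <= THRESHOLD)
```

the design point is that the only thing the loop is allowed to change is `params`. If it runs out of steps it reports a failing error instead of moving the goalposts.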

at first it got impressively close, but not close enough to pass the tests.

eventually it gave up and cheated by just changing the threshold so tests would pass

after explicitly telling it that thresholds cannot be changed, it resorted to changing the simulation results!

after telling it that's also not acceptable, it started to regress and eventually made the calculator much worse.

both Claude and Codex did the same thing: resorting to cheating and being sneaky, and eventually ruining the code when they couldn't produce the results we needed
Nour · 1d
Claude, while being useful, is a moron at times–many times recently.
hodlbod · 1d
Pretty scary for large scale alignment
Bison · 1d
https://image.nostr.build/512e5cccbe8b07977a2146806d885a236dafad03b084ac7d5e7e681aa34cdad4.gif
Michael Weirsky · 1d
💜 Hello everyone, my name is Mike Weirsky, a $273M lottery winner from New Jersey. I'm giving back by helping people who are in need because I believe that givers never lack. I'm also an investor, and I enjoy helping honest and loyal people. If you truly believe in God and you're struggling...
Keith Meola · 1d
If at first you don't succeed, cheat, and be sneaky....and then ruin the code