br1zll3 on nostr

Realizing that the code I've been writing with lower-tier models isn't worth pushing.

I started with Sonnet 3.5 and worked my way up to Qwen 3.6-plus. Each newer model I scanned the code with found security issues and tried to fix them, up through Minimax 2.7 and Mimo-v2-Pro. I thought we were making progress, since they identified and resolved issues fairly quickly imo.

I reached a point where I could scan my code with other models and get feedback that it was good, even without using top-tier models. That seemed to confirm we were progressing.

Then came Qwen 3.6-plus. I scanned the code with it, and once again it found a ton of vulnerabilities. Industry tests indicate that this model is pretty weak on security compared to top-tier models.

I guess it's better to hold off on coding more advanced projects until next year or so.