Damus
Hoshino Lina (星乃リナ) 🩵 3D Yuri Wedding 2026!!!
Why are AI people so monumentally *bad* at copyright?

I'm looking for ethical/copyright-safe training data sets. Common Corpus sells itself as that... but then I go read the paper and they include CC BY-SA scientific papers and GPL stuff from GitHub, and then in models trained on that dataset they proudly state:

> Only trained on open data under a permissible [sic] license [...] By design, all Pleias model are unable to output copyrighted content.

Um, no?? CC BY-SA is not public domain, it's a copyright license with conditions (attribution and share-alike). You can't train on CC BY-SA content and then claim your model is any more copyright-safe than whatever Google and Meta are releasing. It just means the only people whose copyright you're violating are the ones who released their content under open licenses.
Hoshino Lina (星乃リナ) 🩵 3D Yuri Wedding 2026!!! · 9w
So far the only one I've seen that can credibly claim to be free of copyright concerns is KL3M (it was created... by lawyers). Are there any others?
Мя · 9w
nostr:nprofile1qy2hwumn8ghj7un9d3shjtnyd968gmewwp6kyqpq6tx08mwy9vkkjen5s8ahy9e3x5z4dmykefvs7u6wex0s02puuskqv750mt it's considered safe if copyright holders don't have money to sue you :tone_sarcasm: