Granted, I really don’t know much about how all this works, but it occurred to me that Lemmy, wonderfully open as it is and lacking any kind of ‘disappearing messages’ or other privacy-protecting functionality, is basically a smorgasbord for AI scrapers. Or am I (hopefully) wrong about this?
Yes, polluting data sets is a way to combat unethical LLMs, but there’s no practical way to publish something publicly while protecting it from data scrapers.