Protect against AI scraping that they can’t monetize, even though it uses none of their own server time
Don’t they already?
Whenever I looked for old Reddit threads on the Internet Archive, they never showed up.
But not LLM training 🤔
This could potentially destroy existing archived data
How so? Do archive services not also archive content from linked CDNs?
Maybe I’m mistaken, but I have heard the Internet Archive applies robots.txt retroactively.