Reddit will block the Internet Archive

☆ Yσɠƚԋσʂ ☆@lemmygrad.ml · 4 days ago

into_highest_invite@lemmygrad.ml · 4 days ago

To protect against AI scraping that they can’t monetize, even though it uses none of their own server time.

FanofOatmeal [none/use name]@hexbear.net · 2 days ago

Don’t they already?

Whenever I looked for old Reddit threads on the Internet Archive, they never showed up.

P1d40n3 [he/him]@hexbear.net · 4 days ago

But not LLM training 🤔

ThermonuclearEgg [she/her, they/them]@hexbear.net · 4 days ago

This could potentially destroy existing archived data

LargeAdultRedBook [none/use name]@hexbear.net · 2 days ago

How so? Do archive services not also archive content from linked CDNs?

ThermonuclearEgg [she/her, they/them]@hexbear.net · edit-2 2 days ago

Maybe I’m mistaken, but I have heard the Internet Archive applies robots.txt retroactively.