• Alexstarfire@lemmy.world
    English
    arrow-up
    83
    arrow-down
    1
    ·
    5 days ago

    Pot calling the kettle black. Funny how they have a problem with AI accessing what they perceive as their data. Fucking moochers.

    • PhilipTheBucket@piefed.social
      English
      arrow-up
      5
      ·
      5 days ago

      “They can’t monetize our users’ self-created content for ridiculously exploitative gains. Only we can monetize(*) our users’ self-created content for ridiculously exploitative gains!”

      (* Well, try to monetize, they haven’t actually got it to work yet)

  • pelespirit@sh.itjust.works
    English
    arrow-up
    44
    arrow-down
    1
    ·
    5 days ago

    Reddit says that it has caught AI companies scraping its data from the Internet Archive’s Wayback Machine, so it’s going to start blocking the Internet Archive from indexing the vast majority of Reddit. The Wayback Machine will no longer be able to crawl post detail pages, comments, or profiles; instead, it will only be able to index the Reddit.com homepage, which effectively means IA will only be able to archive insights into which news headlines and posts were most popular on a given day.

    Reddit is interesting right now. I took a look at the front page last week and there was an obvious post against Trump. I checked inside and most of the comments were generic or something? Hard to pinpoint. It reminded me of r/conservative or r/The_Donald. I think they were bots? Here on Lemmy, you might not get your post on the all page, but it seems like mostly real people.

    • M137@lemmy.world
      English
      arrow-up
      9
      ·
      5 days ago

      Yeah, I have visited Reddit a handful of times in the last couple of years because there was no active community here for something I needed help with or wanted more info about. And it’s such a weird place now: all the comments look like they’re from bots, children, or just incredibly dumb people. It’s like a weird fever dream, very disturbing.

    • Cid Vicious@sh.itjust.works
      English
      arrow-up
      5
      ·
      5 days ago

      It seems unlikely that there are a lot of people botting Lemmy right now just because…why bother? But I doubt there’s anything systemic in place to prevent it, and I’d imagine that the decentralized nature would make it really easy to skirt attempts to block it.

      • pelespirit@sh.itjust.works
        English
        arrow-up
        8
        ·
        5 days ago

        I’ve come across quite a few bots, but I’m a mod, so that might be why. The mods and admins here on my instance are amazing; they catch a lot of crap. I think you’re witnessing well-run instances, not an absence of bots.

          • pelespirit@sh.itjust.works
            English
            arrow-up
            4
            ·
            5 days ago

            They don’t show up on the well-run ones, so that only matters if you’re on a lesser instance. Those instances get blocked pretty quickly.

            My only gripe with Lemmy is that the huge instance controls what is on the all page and what its users see in their accounts. I think the biggest is owned by Reddit or Meta or some other place like that. Most handle the bots pretty well.

      • sp3ctr4l@lemmy.dbzer0.com
        English
        arrow-up
        5
        ·
        5 days ago

        We had the whole… ‘Fediverse Girl’ thing, a while back now… somebody stole someone’s pic and was DMing people as a sort of brute force catfish attempt…

        But I haven’t seen too much of anything else?

  • UnderpantsWeevil@lemmy.world
    English
    arrow-up
    19
    arrow-down
    2
    ·
    edit-2
    5 days ago

    Honestly for the best, given how the site has degraded over the last decade.

    Internet Archive doesn’t need a thousand bots posting slurs at one another for eternity.

  • mindbleach@sh.itjust.works
    English
    arrow-up
    8
    ·
    5 days ago

    As I’ve been saying since Pictures For Sad Children disappeared: the Internet Archive should do it anyway.

    Your website is public. Anyone can see it. Fuck off pretending otherwise.

  • CluckN@lemmy.world
    English
    arrow-up
    4
    ·
    5 days ago

    What a weird pay model. Is their plan for AI companies to pay insane API fees to scrape data from Reddit? Why would anyone pay when every other model scrapes it for free?

  • paraphrand@lemmy.world
    English
    arrow-up
    3
    ·
    5 days ago

    Cool. Block links to it and other archives too.

    That’s only fair. (And it’ll piss off reddit users)