• pyre@lemmy.world
    1 day ago

    it uses the result of your labor without compensation. it’s not theft of the copyrighted material. it’s theft of the payment.

    it’s different from piracy in that piracy doesn’t equate to lost sales. someone who pirates a song or game probably does so because they wouldn’t buy it otherwise. either they can’t afford it or they don’t find it worth buying. so if they couldn’t pirate it, they still wouldn’t buy it.

    but this is a company using labor without paying for it, when paying is something they would otherwise definitely have to do. he literally says it would be over if they couldn’t get this data. they just don’t want to pay for it.

    • masterspace@lemmy.ca
      22 hours ago

      That information is published freely online.

      Do companies have to avoid hiring people who read and were influenced by copyrighted material?

      I can regurgitate copyrighted works as well, and when someone hires me, places like Stack Overflow get fewer views on the pages that I’ve already read and trained on.

      Are companies committing theft by letting me read the internet to develop my intelligence? Are they committing theft when they hire me so they don’t have to do as much research themselves? Are they committing theft when they hire thousands of engineers who have read and trained on copyrighted material to build up internal knowledge bases?

      What’s actually happening is that the debates around AI are exposing a deeply and fundamentally flawed copyright system. It should not be based on scarcity and restriction but on rewarding use. Information has always been able to flow freely; the mistake was linking payment to restricting its movement.

      • pyre@lemmy.world
        23 hours ago

        it’s ok if you don’t know how copyright works. also maybe look into plagiarism. there’s a difference between relaying information you’ve learned and stealing work.

        • Grimy@lemmy.world
          23 hours ago

          Training on publicly available material is currently legal. It is how your search engine was built and it is considered fair use mostly due to its transformative nature. Google went to court about it and won.

          • pyre@lemmy.world
            22 hours ago

            can you point to the trial they won? I only know about a case that was dismissed.

            because what we’ve seen from ai so far is hardly transformative.

            • Grimy@lemmy.world
              21 hours ago

              Sorry, I was talking about hiQ Labs v. LinkedIn. But there is Perfect 10 v. Google and Authors Guild v. Google, which show that scraping public data is perfectly fine and which include the company in question.

              An image generator is trained on a billion images and is able to spit out completely new images of whatever you ask for. Calling it anything but transformative is silly, especially when such things as collage are considered transformative.

              • pyre@lemmy.world
                21 hours ago

                eh, “completely new” is a huge stretch there. splicing two or ten movies together doesn’t give you an automatic pass.