• @Teanut@lemmy.world
    English
    16
    27 days ago

    I hate to break it to you, but if you’re running an LLM based on (for example) Llama, the training data (corpus) that went into it was still large parts of the Internet.

    The fact that you’re running the prompts locally doesn’t change the fact that it was still trained on data that could be considered protected under copyright law.

    It’s going to be interesting to see how the law shakes out on this one, because an artist going to an art museum and doing studies of those works (and let’s say it’s a contemporary art museum where the works wouldn’t be in the public domain) for educational purposes is likely fair use - and possibly encouraged to help artists develop their talents. Musicians practicing (or even performing) other artists’ songs is expected during their development. Consider some high school band practicing in a garage, playing some song to improve their skills.

    I know the big difference is that it’s people training vs. a machine/LLM training, but that seems to come down to not so much a copyright issue (which it is in an immediate sense) as a broader question: should an algorithm be entitled to the same protections as a person? If not, what if real AI (not just an LLM) is developed? Should those entities be entitled to personhood?

    • @nexussapphire@lemm.ee
      English
      -1
      27 days ago

      I hate to break it to you, but not all machine learning is LLM-based. I’ve been messing with neural TTS from a small project called Piper. I’m looking into an image recognition neural network to write software for and train myself. I might try writing it myself for fun 🤔

      I’m not interested in anything that uses stolen data like that, so my options are limited to incredibly focused single-purpose tools or things I make myself with the tools available.

      I’d love to play with image generation and large language models, but until all the legal stuff is worked out and individuals get paid for their work, I’m not touching it.

      To me it’s as cut and dried as this: if it’s the difference between an individual becoming their own boss or making a better living and a corporation growing its market cap, I’ll always choose the individual. I know there’s a possibility of that growth resulting in more jobs, but I’d rather have an environment where small businesses open, breed competition, and overall improve everyone’s life. Let’s not give the keys over to companies like Microsoft and close more doors.

      I don’t care about the discussion of true AI having rights. It’s only going to be used to make the wealthy wealthier.

      • @hellofriend@lemmy.world
        English
        9
        27 days ago

        All LLMs are based on neural networks. Furthermore, all neural networks need training, regardless of whether they’re an LLM or some other form of machine learning. If you want to ensure there’s no stolen material used in the neural net then you have to train it yourself with material that you have the copyright to.

            • @nexussapphire@lemm.ee
              English
              5
              26 days ago

              Sorry, I thought you were being a smartass and just skimmed through it. Truly my bad.

              Edit: it’s hard to tell intention sometimes and I really do appreciate you summarizing what I said. It’s true and a more approachable answer than what I gave.

      • @nexussapphire@lemm.ee
        English
        3
        27 days ago

        Sorry, I feel strongly about this. Play with it all you want; it’s really cool shit! But please don’t pay for access to it, and if you need some art or a professional write-up, please just pay someone to do it.

        It’ll mean so much to your fellow man in these uncertain times and the quality will be so much better.