I know a lot of people want to interpret copyright law so that allowing a machine to learn concepts from a copyrighted work is copyright infringement, but I think what people will need to consider is that all that’s going to do is keep AI out of the hands of regular people and place it specifically in the hands of people and organizations who are wealthy and powerful enough to train it for their own use.
If this isn’t actually what you want, then what’s your game plan for placing copyright restrictions on AI training that will actually work? Have you considered how it’s likely to play out? Are you going to be able to stop Elon Musk, Mark Zuckerberg, and the NSA from training an AI on whatever they want and using it to push propaganda on the public? As far as I can tell, all that copyright restrictions will accomplish is to concentrate the power of AI (which we’re only beginning to explore) in the hands of the sorts of people who are the least likely to want to do anything good with it.
I know I’m posting this in a hostile space, and I’m sure a lot of people here disagree with my opinion on how copyright should (and should not) apply to AI training, and that’s fine (the jury is literally still out on that). What I’m interested in is what your end game is. How do you expect things to actually work out if you get the laws that you want? I would personally argue that an outcome where Mark Zuckerberg gets AI and the rest of us don’t is the absolute worst possibility.
Agreed. The AI is rolling already, we can’t stop it now. All we can do is make sure that this technology benefits everyone, not just corporations.
We are coming to a reckoning not only over who uses AI but over how society is handled overall. We are getting to a point where even intellectual work can be automated, and however spotty it might be now, AI will only get better at it. For the many, many people who will see their jobs automated, their biggest concern is not whether they will be allowed to use ChatGPT, it’s whether they will have any kind of livelihood.
We are used to the idea that automation frees people to work less strenuous, more satisfying jobs, but what are we being freed to do if even artistic expression is taken out of the hands of people? Rethinking how AI and automation benefit everyone needs to be done on a much larger scale.
Seriously, the average person has two FAR more immediate problems than not being able to create their own AI:
1. Losing their livelihood to an AI.
2. Losing their life because an AI has been improperly placed in a decision making position because it was sold as having more capabilities than it actually has.
1 could be solved by severe and permanent economic reforms, but those reforms are very far away. 2 is also going to need legal restrictions on what jobs an AI can do, and restrictions on the claims that an AI company can make when marketing their product. Possibly a whole freaking government agency designated for certifying AI.
Right now, it’s in our best interest that AI production is slowed down and/or prevented from being deployed to certain areas until we’ve had a chance for the law to catch up. Copyright restrictions and privacy laws are going to be the most effective way to do this, because it will force the companies to go back and retrain on public domain and prevent them from using AI to wholesale replace certain jobs.
As for the average person who has the computer hardware and time to train an AI (bear in mind Google Bard and Open AI use human contractors to correct misinformation in the answers as well as scanning), there is a ton of public domain writing out there.
The endgame, though, is to stop scenario 1 and scenario 2, and the best way to do that is any way that forces the people who are making AI to sit down and think about where they can use the AI. Because the problem is not the speed of AI development, but the speed of corporate greed. And the problem is not that the average person LACKS access to AI, but that the rich have TOO much access to AI and TOO many horrible plans about how to use it before all the bugs have been worked out.
Furthermore, if they’re using people’s creativity to make a product, it’s just WRONG not to have permission or to not credit them.
Losing their life because an AI has been improperly placed in a decision making position because it was sold as having more capabilities than it actually has.
I would tend to agree with you on this one, although we don’t need bad copyright legislation to deal with it, since laws can deal with it more directly. I would personally put in place an organization that requires rigorous proof that AI in those roles is significantly safer than a human, like the FDA does for medication.
As for the average person who has the computer hardware and time to train an AI (bear in mind Google Bard and Open AI use human contractors to correct misinformation in the answers as well as scanning), there is a ton of public domain writing out there.
Corporations would love if regular people were only allowed to train their AIs on things that are 75 years out of date. Creative interpretations of copyright law aren’t going to stop billion- and trillion-dollar companies from licensing things to train AI on, either by paying a tiny percentage of their war chests or just ignoring the law altogether the way Meta always does, and getting a customary slap on the wrist. What will end up happening is that Meta, Alphabet, Microsoft, Elon Musk and his companies, government organizations, etc. will all have access to AIs that know current, useful, and relevant things, and the rest of us will not, or we’ll have to pay monthly for the privilege of access to a limited version of that knowledge, further enriching those groups.
Furthermore, if they’re using people’s creativity to make a product, it’s just WRONG not to have permission or to not credit them.
Let’s talk about Stable Diffusion for a moment. Stable Diffusion models can be compressed down to about 2 gigabytes and still produce art. Stable Diffusion was trained on 5 billion images and finetuned on a subset of 600 million images, which means that the average image contributes 2 GB/600M, or a little bit over three bytes, to the final model. With the exception of a few mostly public domain images that appeared in the dataset hundreds of times, Stable Diffusion learned broad concepts from large numbers of images, similarly to how a human artist would learn art concepts. If people need permission to learn a teeny bit of information from each image (3 bytes of information isn’t copyrightable, btw), then artists should have to get permission for every single image they put on their mood boards or use for inspiration, because they’re taking orders of magnitude more than three bytes of information from each image they use for inspiration on a given work.
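For anyone who wants to check that arithmetic, here is a quick back-of-the-envelope sketch. It only uses the figures quoted in the comment above (a roughly 2 GB model, 600 million fine-tuning images, 5 billion images overall) and assumes decimal gigabytes; the variable names are made up for illustration, and none of the figures are independently verified here.

```python
# Back-of-the-envelope check using only the figures quoted in the comment.
# Assumption: "2 gigabytes" means 2 * 10**9 bytes (decimal GB).
model_size_bytes = 2 * 10**9
finetune_images = 600 * 10**6   # "a subset of 600 million images"
all_images = 5 * 10**9          # "5 billion images"

# Average share of the model's size per image, if every image counted equally.
bytes_per_finetune_image = model_size_bytes / finetune_images
bytes_per_image_overall = model_size_bytes / all_images

print(f"{bytes_per_finetune_image:.2f} bytes per fine-tuning image")
print(f"{bytes_per_image_overall:.2f} bytes per image across the full set")
```

This is only an average; it says nothing about how much any individual image influenced the weights.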
Except an AI is not taking inspiration, it’s compiling information to determine mathematical averages.
A human can be inspired because they are a human being. A Large Language Model cannot. Stable Diffusion is not near the complexity of a human brain. Just because it does it faster doesn’t mean it’s doing it the same way. Human beings have free will and a host of human rights. A human being is paid for the work they do, an AI program’s creator is paid for the work it did. And if that creator used copyrighted work, then he should be having to get permission to use it, because he’s profiting off this AI program.
I would tend to agree with you on this one, although we don’t need bad copyright legislation to deal with it, since laws can deal with it more directly. I would personally put in place an organization that requires rigorous proof that AI in those roles is significantly safer than a human, like the FDA does for medication.
I would too, but we need TIME to get that done and right now, lawsuits will buy us time. That was the point of my comment.
Except an AI is not taking inspiration, it’s compiling information to determine mathematical averages.
The AIs we’re talking about are neural networks. They don’t do statistics, they don’t have databases, and they don’t take mathematical averages. They simulate neurons, and their ability to learn concepts is emergent from that, the same way the human brain is. Nothing about an artificial neuron ever takes an average of anything, reads any database, or does any statistical calculations. If an artificial neural network can be said to be doing those things, then so is the human brain.
There is nothing magical about how human neurons work. Researchers are already growing small networks out of animal neurons and using them the same way that we use artificial neural networks.
There are a lot of “how AI works” articles out there that put things in layman’s terms (and use phrases like “statistical analysis” and “mathematical averages”), and unfortunately people (including many very smart people) extrapolate from the incorrect information in those articles and end up making bad assumptions about how AI actually works.
A human being is paid for the work they do, an AI program’s creator is paid for the work it did. And if that creator used copyrighted work, then he should be having to get permission to use it, because he’s profiting off this AI program.
If an artist uses a copyrighted work on their mood board or as inspiration, then they should pay for that, because they’re making a profit from that copyrighted work. Human beings should, as you said, be paid for the work they do. Right? If an artist goes to art school, they should pay all of the artists whose work they learned from, right? If a teacher teaches children in a class, that teacher should be paid a royalty each time those children make use of the knowledge they were taught, right? (I sense a sidetrack – yes, teachers are horribly underpaid and we desperately need to fix that, so please don’t misconstrue that previous sentence.)
There’s a reason we don’t copyright facts, styles, and concepts.
Oh, and if you want to talk about something that stores an actual database of scraped data, makes mathematical and statistical inferences, and reproduces things exactly, look no further than Google. It’s already been determined in court that what Google does is fair use.
@IncognitoErgoSum Gonna need a source on Large Language Models using neural networks based on the human brain here.
EDIT: Scratch that. I’m just going to need you to explain how this is based on the human brain functions.
I’m willing to, but if I take the time to do that, are you going to listen to my answer, or just dismiss everything I say and go back to thinking what you want to think?
Also, a couple of preliminary questions to help me explain things:
What’s your level of familiarity with the source material? How much experience do you have writing or modifying code that deals with neural networks? My own familiarity lies mostly with PyTorch. Do you use that or something else? If you don’t have any direct familiarity with programming with neural networks, do you have enough of a familiarity with them to at least know what some of those boxes mean, or do I need to explain them all?
Most importantly, when I say that neural networks like GPT-* use artificial neurons, are you objecting to that statement?
I need to know what it is I’m explaining.
@IncognitoErgoSum I don’t think you can. Because THIS? Is not a model of how humans learn language. It’s a model of how a computer learns to write sentences.
If what you’re going to give me is an oversimplified analogy that puts too much faith in what AI devs are trying to sell and not enough faith in what a human brain is doing, then don’t bother because I will dismiss it as a fairy tale.
But, if you have an answer that actually, genuinely proves that this “neural” network is operating similarly to how the human brain does… then you have invalidated your original post. Because if it really is thinking like a human, NO ONE should own it.
In either case, it’s probably not worth your time.
If what you’re going to give me is an oversimplified analogy that puts too much faith in what AI devs are trying to sell and not enough faith in what a human brain is doing, then don’t bother because I will dismiss it as a fairy tale.
I’m curious, how do you feel about global warming? Do you pick and choose the scientists you listen to? You know that the people who develop these AIs are computer scientists and researchers, right?
If you’re a global warming denier, at least you’re consistent. But if out of one side of your mouth you’re calling what AI researchers talk about a “fairy tale”, and out of the other side of your mouth you’re criticizing other people for ignoring science when it suits them, then maybe you need to take time for introspection.
You can stop reading here. The rest of this is for people who are actually curious, and you’ve clearly made up your mind. Until you’ve actually learned a bit about how they actually work, though, you have absolutely no business opining about how policies ought to apply to them, because your views are rooted in misconceptions.
In any case, curious folks, I’m sure there are fancy flowcharts around about how data flows through the human brain as well. The human brain is arranged in groups of neurons that feed back into each other, whereas an AI neural network is arranged in more ordered layers. Their structure isn’t precisely the same. Notably, an AI (at least, as they are commonly structured right now) doesn’t experience “time” per se, because once it’s been trained its neural connections don’t change anymore. As it turns out, consciousness isn’t necessary for learning and reasoning as the parent comment seems to think.
Human brains and neural networks are similar in the way that I explained in my original comment – neither of them store a database, neither of them do statistical analysis or take averages, and both learn concepts by making modifications to their neural connections (a human does this all the time, whereas an AI does this only while it’s being trained). The actual neural network in the above diagram that OP googled and pasted in here lives in the “feed forward” boxes. That’s where the actual reasoning and learning is being done. As this particular diagram is a diagram of the entire system and not a diagram of the layers of the feed-forward network, it’s not even the right diagram to be comparing to the human brain (although again, the structures wouldn’t match up exactly).
But, if you have an answer that actually, genuinely proves that this “neural” network is operating similarly to how the human brain does… then you have invalidated your original post. Because if it really is thinking like a human, NO ONE should own it.
I think this is a neat point.
The human brain is very complex. The neural networks trained on computers right now are more like collections of neurons grown together in a petri dish, rather than a full human brain. They serve one function, say, recognizing or generating an image or calculating some probability or deciding on what the next word should be in a sequence. While the brain is a huge internetwork of these smaller, more specialized neural networks.
No, neural networks don’t have a database and they don’t do stats. They’re trained through trial and error, not aggregation. The way they work is explicitly based on a mathematical model of a biological neuron.
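For the curious, here is a minimal sketch of what “a mathematical model of a neuron” and “trained through trial and error” can look like in code. It is the generic textbook version (one neuron, a sigmoid activation, small weight adjustments after each guess), not how GPT, Stable Diffusion, or any other real system is implemented. The function names, the OR-gate toy task, the learning rate, and the step count are all assumptions chosen for illustration.

```python
import math
import random

def neuron(inputs, weights, bias):
    """Textbook artificial neuron: weight each input, add a bias, squash with a sigmoid."""
    total = sum(x * w for x, w in zip(inputs, weights)) + bias
    return 1.0 / (1.0 + math.exp(-total))

def train(samples, steps=10000, lr=0.5, seed=0):
    """Repeatedly guess, compare with the target, and nudge the weights to shrink the error."""
    rng = random.Random(seed)
    weights = [rng.uniform(-1, 1) for _ in range(2)]
    bias = 0.0
    for _ in range(steps):
        inputs, target = rng.choice(samples)
        out = neuron(inputs, weights, bias)
        # Gradient of the squared error with respect to the pre-activation sum
        # (constant factor folded into the learning rate).
        grad = (out - target) * out * (1.0 - out)
        weights = [w - lr * grad * x for w, x in zip(weights, inputs)]
        bias -= lr * grad
    return weights, bias

# Toy task (hypothetical example): learn the logical OR of two inputs.
OR_SAMPLES = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 1)]
weights, bias = train(OR_SAMPLES)
for inputs, target in OR_SAMPLES:
    print(inputs, "target:", target, "output:", round(neuron(inputs, weights, bias), 2))
```

Nothing in this sketch stores the training examples; after training, all that remains is two weights and a bias. Whether that counts as “doing math” or “learning” is exactly what this thread is arguing about.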
And when an AI is developed that’s advanced enough to rival the actual human brain, then yeah, the AI rights question becomes a real thing. We’re not there yet, though. Still just matter in petri dishes. That’s a whole other controversial argument.
The AIs we’re talking about are neural networks. They don’t do statistics, they don’t have databases, and they don’t take mathematical averages. They simulate neurons, and their ability to learn concepts is emergent from that, the same way the human brain is.
This is not at all accurate. Yes, there are very immature neural simulation systems that are being prototyped but that’s not what you’re seeing in the news today. What the public is witnessing is fundamentally based on vector mathematics. It’s pure math and there is nothing at all emergent about it.
If an artist uses a copyrighted work on their mood board or as inspiration, then they should pay for that, because they’re making a profit from that copyrighted work.
That’s not how copyright works, nor should it. Anyone who creates a mood board from a blank slate is using their learned experience, most of which they gathered from other works. If you were to write a book analyzing movies, for example, you shouldn’t have to pay the copyright for all those movies. You can make a YouTube video right now with a few short clips from a movie or quotes from a book and you’re not violating copyright. You’re just not allowed to make a largely derivative work.
So to clarify, are you making the claim that nothing that’s simulated with vector mathematics can have emergent properties? And that AIs like GPT and Stable Diffusion don’t contain simulated neurons?
Yes, and the math is all publicly documented.
Oh boy! Link, please!
Many things in life are a privilege for these groups. AI is no different.
I’m not sure what you’re getting at with this. It will only be a privilege for these groups if we choose to artificially make it that way. And why would you want to do that?
Do you want to give AI exclusively to the rich? If so, why?
I think he was just stating a fact.
For something to be a fact, it needs to actually be true. AI is currently accessible to everyone.
I disagree. I can barely run a 13B parameter model locally. Much less a 175B parameter model like GPT3. Or GPT4, whatever that model truly is. Or whatever behemoth of a model the NSA almost certainly has and just hasn’t told anyone about. I’ll eat my sock if the NSA doesn’t have a monster LLM along with a myriad of other special purpose models by now.
And even though the research has (mostly) been public so far, the resources needed to train these massive models are out of reach for all but the most privileged. We can train a GPT2 or GPT-Neo if we’re dedicated, but you and I aren’t training an open version of GPT4.
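To put rough numbers on that gap, here is a small sketch that estimates how much memory is needed just to hold a model’s weights. It assumes memory is simply parameter count times bits per parameter, and it ignores activations, caches, and framework overhead, so real requirements are higher. The function name and the choice of 16-bit and 4-bit precision are illustrative assumptions, not details from the thread.

```python
def weights_gb(num_params, bits_per_param):
    """Memory needed to store the weights alone, in decimal gigabytes."""
    return num_params * bits_per_param / 8 / 1e9

# Parameter counts mentioned in the comment above.
models = {"13B": 13 * 10**9, "175B": 175 * 10**9}

for name, params in models.items():
    for bits in (16, 4):
        print(f"{name} model at {bits}-bit: ~{weights_gb(params, bits):.1f} GB")
```

Under these assumptions, a 13B model only fits on a single consumer card with aggressive quantization, while a 175B model needs multiple data-center GPUs even before any training overhead is counted.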
But you can run it.
I’ve got a commodity GPU and I’ve been doing plenty of work with local image generation. I’ve also run and fine-tuned LLMs, though more out of idle interest than for serious usage yet. If I needed to do more serious work, renting time on cloud computing for this sort of thing actually isn’t all that expensive.
The fact that the very most powerful AIs aren’t “accessible” doesn’t mean that AI in general isn’t accessible. I don’t have a Formula 1 racing car but automobiles are still accessible to me.
If we’re just talking about what you can do, then these laws aren’t going to matter because you can just pirate whatever training material you want.
But that is beside my actual point, which is that there is a practical real-world limit to what you, the little guy, and they, the big guys, can do. That disparity is the privilege that OP way back up at the top mentioned.
I have no idea what that original commenter’s opinion on copyright vs training is. Personally I agree with the OP-OP of the whole thread. Training isn’t copying, and even if it were the public interest outweighs the interests of the copyright holders in this regard. I’m just saying that in the real world there is a privilege that the elites and ultra-corps have over us, regardless of what systems we set up, unless capitalism and society as a whole are upended.
At this point we’re just bickering over semantics.
So clearly we do agree on most of this stuff, but I did want to point out a possibility you may not have considered.
If we’re just talking about what you can do, then these laws aren’t going to matter because you can just pirate whatever training material you want.
This depends on the penalty and how strictly it’s enforced. If it’s enforced like normal copyright law, then you’re right; your chances of getting in serious trouble just for downloading stuff are essentially nil – the worst thing that will happen to you is your ISP will three-strikes you and you’ll lose internet access. On the other hand, there’s a lot of panic surrounding AI, and the government might use that as an excuse to pass laws that would give people prison time for possessing one, and then fund strict enforcement. I hope that doesn’t happen, but with rumblings of insane laws that would give people prison time for using a VPN to watch a TV show outside of the country, I’m a bit concerned.
As for the parent comment’s motivations, it’s hard to say for sure with any particular individual, but I have noticed a pattern among neoliberals where they say things like “well, the rich are already powerful and we can’t do anything about it, so why try” or “having universal health care, which the rest of the first world has implemented successfully, is unrealistic, so why try” and so on. It often boils down to giving lip service to progressive social values while steadfastly refusing to do anything that might actually make a difference. It’s economic conservatism dressed as progressivism. Even if that’s not what they meant (and it would be unwise of me to just assume that), I feel like that general attitude needs to be confronted.
AI is more than just ChatGPT.
When we talk about reinterpreting copyright law in a way that makes AI training essentially illegal for anything useful, it also restricts smaller and potentially more focused networks. They’re discovering that smaller networks can perform very well (not at the level of GPT-4, but well enough to be useful) if they’re trained in a specific way where reasoning steps are spelled out in the training.
Also, there are used nvidia cards currently selling on Amazon for under $300 with 24 gigs of ram and AI performance almost equal to a 3090, which puts group-of-experts models like a smaller version of GPT-4 within reach of people who aren’t ultra-wealthy.
There’s also the fact that there are plenty of companies currently working on hardware that will make AI significantly cheaper and more accessible to home users. Systems like ChatGPT aren’t always going to be restricted to giant data centers, unless (as some people really want) laws are passed to prevent that hardware from being sold to regular people.
I want to be clear that I don’t disagree with your premise and your assertion that AI training should be legal regardless of copyright of the training material. My only point was that the original commenter said the ultra-elites have privilege over us little guys, and he was right in that regard. I have no idea how that plays into his opinion on this whole matter, only that what he said on its face is accurate.
I’ve been thinking along the same lines. My concern has been that dictatorships would violate Western copyright and would thus go further than the West, and especially Europeans, who are heading toward very strict laws. It’s a nightmare scenario.
And your concern on the rich only makes sense to me, too.
You have not clearly defined the danger. You just said “ai is here”. Well, lawyers are here too and they have the law on their side. Also the ai will threaten their model, so they will probably have no mercy anyway and will work full time on the subject.
Wealthy and powerful corporations fear the law above anything else. A single parliament can shut down their activity better than anyone else on the planet.
Maybe you talk from the point of view of a corrupt country like the USA, but the EU parliament, which BTW doesn’t host any GAFAM, is totally ready to strike hard on the businesses founded on AI.
See, people don’t want to lose their jobs to a robot and they will fight for them. This creates a major threat to AI: people destroying data centers. They will do it. Their interests will converge with the interests of the people who care about global warming. Don’t take AI as something inevitable. An AI has a high dependency on resources, generates unemployment and pollution, and has questionable value.
An AI requires:
Energy
Water
High tech hardware
Network
Security
Stability
Investments
It’s like a nuclear power plant but more fragile. If an activist group takes down a datacenter hosting an ai, who will blame them? The jury will take turns to high five them.
Wow, you have this all planned out, don’t you?
If that’s what Europe is like, they’ll build their data centers somewhere else. Like the corrupt USA. Again, you’ll be taking away your access to AI, not theirs.
I don’t think the EU is so lawless as to allow blatant property destruction, and if it is, I can’t imagine such a lack of rule of law will do much for the EU’s future economic prosperity.
I’m probably just a dumb hick American though.
The economic prosperity came bundled with an ecological debt, provoked by the overuse of oil. Oil is cheap and makes everything cheap. Remove oil and everything increases in price. The “prosperity” is behind us now. I don’t see what an AI like the one described above would bring in terms of prosperity.
There is a debate in France about the morality of acting against the law when it comes to protesting against global warming. And a datacenter is in the jurisdiction of the people fighting against global warming.
We should not take order for granted. Keep in mind that the temperature will ramp up slowly each year, destroying our environment a little bit more each year. When the time for sacrifice comes, I bet that AI will be very high on the list.
Why do you think people will build data centers in Europe when they can build them elsewhere?
Tell us, I don’t know. All I know is that when a data center requires more water than the environment can provide, there will be conflicts over water, and the people living nearby will protest. And the most active of them will pull the plug at night or funny stuff like that. Data centers are fragile things.
As the technology improves, data centers that run AI will require significantly less cooling. GPUs aren’t very power-efficient for doing AI stuff because they have to move a lot of data around from their memory to their processor cores. There are AI-specific cards being worked on that will allow the huge matrix multiplications to happen in place without that movement happening, which will mean drastically lower power and cooling requirements.
Also, these kinds of protestors are the same general group of people who stopped nuclear power from becoming a bigger player back in the 1960s and 70s. If we’d gone nuclear and replaced coal, we almost certainly wouldn’t be sitting here at the beginning of what looks to be a major global warming event that’s unlike anything we’ve ever seen before. It wouldn’t have completely solved the problem, but it would have bought us time. An AI may be able to help us develop ideas to mitigate global warming, and it seems ridiculous to me to go all luddite and smash the machines over what will be a minuscule overall contribution to it given the possibility that it could help us solve the problem.
But let’s be real here; these hypothetical people smashing the machines are doing it because they’ve bought into AI panic, not because they’re afraid of global warming. If they really want to commit acts of ecoterrorism, there are much bigger targets.
I can’t believe that you are blaming the green people! Those people are the ones who consume the least and begged you to consume less. Did you do it? No, you didn’t. Had people like you listened then we wouldn’t be in our current situation. You wanted the ultimate comfort no matter what and you listened to nothing. We’ve been talking about the greenhouse effect since the previous century.
You will never move a boat with nuclear, you will never move an airplane with nuclear, you will never fertilize a field with nuclear. Stop dreaming.
Also, these kinds of protestors are the same general group of people who stopped nuclear power from becoming a bigger player back in the 1960s and 70s. If we’d gone nuclear and replaced coal, we almost certainly wouldn’t be sitting here at the beginning of what looks to be a major global warming event that’s unlike anything we’ve ever seen before. It wouldn’t have completely solved the problem, but it would have bought us time.
Short sighted view of the problem. First there is not enough uranium for everyone.
Second, nuclear power is reserved to stable countries.
Third, there is no uranium in the EU, making it yet another tool for pressuring countries.
An AI may be able to help us develop ideas to mitigate global warming, and it seems ridiculous to me to go all luddite and smash the machines over what will be a minuscule overall contribution to it given the possibility that it could help us solve the problem.
HAHAHA!
“The AI will save us!”
Eat less meat! How hard is it to compute! So turn off your stupid AI and eat less meat. Do it now, stop eating meat.
You know exactly what to do, you just DON’T WANT TO DO IT BECAUSE YOU ARE LAZY AND ADDICTED TO COMFORT.
If you don’t do what tens of thousands of scientists are telling you to do right now then you will never do what a robot tells you to do. Your face when the AI tells you to stop eating meat. “But this is not possible, we can’t do this, the AI is wrong! We need a bigger AI!!”
omg, the denial.
But let’s be real here; these hypothetical people smashing the machines are doing it because they’ve bought into AI panic, not because they’re afraid of global warming. If they really want to commit acts of ecoterrorism, there are much bigger targets.
Like the tires of your car.
You will never move a boat with nuclear,
I assume you haven’t heard of aircraft carriers and nuclear submarines.
Also, nuclear power can be stored in batteries and capacitors and then used to move electric vehicles (including boats, planes, and tractors), so I don’t know what the hell you’re even talking about.
Eat less meat! How hard is it to compute! So turn off your stupid AI and eat less meat. Do it now, stop eating meat.
I’ve actually cut my meat consumption way down.
That being said, a person using AI consumes an absolutely minuscule amount of power compared to a person eating a steak. One steak (~20 kWh) is equivalent to about 60 hours of full-time AI usage (300 W for an NVIDIA A100 at max capacity), and most of the time a person spends using an AI is spent idling while they type and read, so realistically it’s a lot longer than that.
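Here is that comparison worked out as a quick sketch. Both inputs are the figures from the comment above (roughly 20 kWh per steak, 300 W for a GPU at full load); they are taken at face value rather than verified, and the variable names are just for illustration.

```python
# Figures as quoted in the comment; treated as assumptions, not verified values.
steak_energy_kwh = 20.0   # claimed energy footprint of one steak
gpu_power_watts = 300.0   # claimed draw of one GPU at max capacity

# Hours of continuous full-load GPU time with the same energy budget.
gpu_hours = steak_energy_kwh * 1000.0 / gpu_power_watts
print(f"One steak is roughly {gpu_hours:.1f} hours of full-load GPU time")
```

With these inputs the result is in the same ballpark as the “about 60 hours” quoted above, and it would stretch further once idle time is taken into account.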
Again, your hypothetical data center smashers are going after AI because they hate AI, not because they care about the environment. There are better targets for ecoterrorism. Like my car’s tires, internet tough guy.
And a datacenter is in the jurisdiction of the people fighting against global warming.
Let me know when a judge agrees with this. Hell, I’ll shoot $100 to you.
It’s gonna be shaky anyway.