A Markov chain predicts the next state based only on the current state. If today is sunny, how likely is it that tomorrow will be rainy? Mathematically, that question can be reduced to a Markov chain (so we don’t have to take the season, weather patterns, or anything else into account for this example).
But a Markov chain doesn’t just say how likely it is to be rainy on a given day; it says how likely it is to be rainy tomorrow based on today. If today is sunny, there’s, let’s say, a 70% chance that tomorrow will be rainy. If today is rainy, there’s a 40% chance that tomorrow will be rainy (and conversely a 60% chance that tomorrow will be sunny, because the probabilities of all possible next states must sum to 100%).
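To show how little machinery that actually takes, here’s a minimal Python sketch of that exact weather chain, using the made-up 70%/40% numbers from above:

```python
import random

# Transition probabilities from the example: rows are today, columns are tomorrow.
TRANSITIONS = {
    "sunny": {"sunny": 0.3, "rainy": 0.7},
    "rainy": {"sunny": 0.6, "rainy": 0.4},
}

def next_day(today):
    # Sample tomorrow's weather using only today's state.
    states, weights = zip(*TRANSITIONS[today].items())
    return random.choices(states, weights=weights)[0]

weather = "sunny"
for day in range(7):
    print(day, weather)
    weather = next_day(weather)
```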
Autocorrect works similarly. It predicts the next word based on the current word you’ve typed out. LLMs are kinda glorified Markov chains because they also predict words (called tokens, which are about 3 to 4 characters each), but they do it over a much larger “current state”: the chat history, custom instructions if you gave any in ChatGPT, and so on. The context passed along with your prompt consists of many tokens, and the AI generates one token at a time until, little by little, it has formed a full response to output.
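To make the “one token at a time” loop concrete, here’s a toy Python sketch. The `next_token_probs` function is a made-up stand-in for the model; a real LLM computes a probability for every token in a vocabulary of tens of thousands, based on the entire context:

```python
import random

def next_token_probs(context):
    # Hypothetical stand-in for the model: given the text so far,
    # return a probability for each candidate next token.
    if context.endswith("name is"):
        return {" Alice": 0.5, " Bob": 0.3, " a": 0.2}
    return {" and": 0.6, ".": 0.4}

def generate(prompt, max_tokens=5):
    text = prompt
    for _ in range(max_tokens):
        probs = next_token_probs(text)
        tokens, weights = zip(*probs.items())
        text += random.choices(tokens, weights=weights)[0]  # sample the next token
    return text

print(generate("Hello! My name is"))
```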
In this framing, the LLM’s Markov chain works like this: if I give it the sentence “Hello! My name is”, for example, it will predict which token is most likely to follow and output it. We can assume this should be a name, but truthfully we don’t know the exact probabilities of the next state. If I give it “Hello, my name is” instead, changing just one character might also change the prediction weighting. I say “might” because the AI is a black box and we don’t really see what happens as the data passes through the neurons.
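If you want to actually peek at those prediction weightings, one way (a sketch, assuming you have the Hugging Face `transformers` library, PyTorch, and the small open GPT-2 model available) is to read off the next-token probabilities for both prompts and compare them:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

def top_next_tokens(prompt, k=5):
    ids = tok(prompt, return_tensors="pt").input_ids
    with torch.no_grad():
        logits = model(ids).logits[0, -1]        # scores for the next token only
    probs = torch.softmax(logits, dim=-1)
    top = torch.topk(probs, k)
    return [(tok.decode(int(i)), round(p.item(), 3))
            for i, p in zip(top.indices, top.values)]

print(top_next_tokens("Hello! My name is"))
print(top_next_tokens("Hello, my name is"))  # one character changed
```

GPT-2 is far smaller and dumber than ChatGPT, but the principle is the same: each prompt maps to a probability distribution over the next token.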
However, if you send that sentence to ChatGPT, it will correctly tell you that your message got cut off and ask you to finish it. That behavior comes from post-training fine-tuning. Compare that to DeepSeek without the reasoning model: