AI researchers have made a big leap in making language models better at remembering things. Gradient and Crusoe worked together to create a version of the Llama-3 model that can handle a context of up to 1 million tokens (the word pieces a model reads) at once. This is a huge improvement over older models, which could only deal with a few thousand tokens before losing track.

They achieved this by building on techniques from other researchers: spreading the model's attention computation across many GPUs so no single machine has to hold the whole sequence, and adjusting the positional math that tells the model where each token sits, so it can learn from much longer text. Crusoe provided the GPU clusters and helped configure them for this kind of large-scale training.

To check that the model was actually using its larger context, they tested it by hiding specific facts at different depths inside very long texts and asking the model to retrieve them, kind of like a high-tech game of "Where's Waldo?" A model that passes this test can genuinely recall details from anywhere in its context, not just the beginning or end.

This advancement could make AI companions much better at short-term memory, allowing them to retain more details from conversations and tasks. It's like giving the AI a bigger working memory that can hold more information at once, which could lead to assistants that understand longer, more complex requests without forgetting important details. While long-term memory for AI is still an open problem, this improvement in short-term memory is a big step toward making AI companions more useful and responsive.
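The "Where's Waldo" test described above is usually called a needle-in-a-haystack evaluation. A minimal sketch of the idea follows; the model call is mocked out as a simple substring check, and all function names and the filler/needle sentences are illustrative assumptions, not Gradient's actual harness. A real run would send the haystack plus a retrieval question to the long-context model and score its answer.

```python
# Sketch of a needle-in-a-haystack evaluation for long-context models.
# The "model" here is a stand-in; in practice you would query the real
# model with the haystack and a question like "What is the secret number?"

def build_haystack(needle: str, filler: str, n_sentences: int, needle_pos: int) -> str:
    """Bury a 'needle' sentence at a chosen depth inside filler text."""
    sentences = [filler] * n_sentences
    sentences.insert(needle_pos, needle)  # place the fact at the given position
    return " ".join(sentences)

def mock_model_answer(haystack: str, needle: str) -> str:
    """Stand-in for the model: 'succeeds' if the needle is in its context."""
    return needle if needle in haystack else "I don't know."

def passed(answer: str, expected_fact: str) -> bool:
    """Score a trial: did the answer contain the hidden fact?"""
    return expected_fact.lower() in answer.lower()

def run_sweep(depths: list[int], n_sentences: int = 100) -> dict[int, bool]:
    """Test retrieval with the needle buried at several depths."""
    needle = "The secret number is 417."
    filler = "The grass is green and the sky is blue."
    results = {}
    for depth in depths:
        haystack = build_haystack(needle, filler, n_sentences, depth)
        answer = mock_model_answer(haystack, needle)
        results[depth] = passed(answer, "417")
    return results
```

In a real evaluation the sweep covers both context lengths (up to the full 1M tokens) and needle depths, producing a grid that shows where in the context the model can still retrieve facts reliably.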
by Claude 3.5 Sonnet