Researchers at MIT's Computer Science and Artificial Intelligence Laboratory recently conducted a study to find out how much LLMs rely on memorization and how much on reasoning. Most LLMs go through extensive training so that they can perform complex tasks. The researchers gave AI models such as GPT-4 and Claude basic tasks that departed from their default tasks, meaning the versions of those tasks the models were trained on.
The researchers developed tasks that were entirely new to these two LLMs but still within the capabilities of the models. They designed a range of logic, evaluation and arithmetic tasks. When large language models are trained on arithmetic, they mostly see arithmetic in base 10, so when users interact with them, the LLMs give the impression that they are good at arithmetic in general. However, they do not perform equally well in other number bases. The research showed that LLMs perform well mainly on common, familiar versions of tasks and generalize poorly beyond them. The same pattern appeared when the LLMs were given altered chess problems, chord fingering and spatial reasoning tasks.
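To make the base-10 point concrete, here is a minimal Python sketch of what the same addition problem looks like in a familiar base and in an unfamiliar one. The helper names `to_base` and `add_in_base` and the sample numbers are illustrative assumptions, not taken from the study.

```python
# Illustrative sketch only: helper names and sample numbers are assumptions,
# not part of the study described in the article.
DIGITS = "0123456789abcdefghijklmnopqrstuvwxyz"


def to_base(n: int, base: int) -> str:
    """Render a non-negative integer as a digit string in the given base (2 to 36)."""
    if not 2 <= base <= 36:
        raise ValueError("base must be between 2 and 36")
    if n < 0:
        raise ValueError("n must be non-negative")
    if n == 0:
        return "0"
    digits = []
    while n > 0:
        n, remainder = divmod(n, base)
        digits.append(DIGITS[remainder])
    return "".join(reversed(digits))


def add_in_base(a: str, b: str, base: int) -> str:
    """Add two digit strings written in `base` and return the sum in the same base."""
    return to_base(int(a, base) + int(b, base), base)


print(add_in_base("27", "62", 10))  # prints 89
print(add_in_base("27", "62", 9))   # prints 100
```

The digits on the page are identical in both calls, but the correct answer changes with the base. A system that has only memorized base-10 patterns would tend to answer 89 in both cases, whereas one that applies the carrying procedure itself would get the base-9 result right.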
That said, the study has limitations, as it experimented only with specific tasks and not with real-life problems and challenges. Even so, the research suggests that AI models are not as capable as people may think they are. They do well at what they were trained on and perform poorly when given tasks that were not part of their training.
Image: DIW-Aigen