Apple Study Reveals Critical Flaws in AI's Logical Reasoning Abilities
Apple's AI research team has uncovered significant weaknesses in the reasoning abilities of large language models, according to a newly published study.
![Apple Silicon AI Optimized Feature Siri 1](https://images.macrumors.com/t/EnlP5X_8GF0tHQBAvPB0cvKS8a0=/400x0/article-new/2024/04/Apple-Silicon-AI-Optimized-Feature-Siri-1.jpg?lossy)
The study, published on arXiv, outlines Apple's evaluation of a range of leading language models, including those from OpenAI, Meta, and other prominent developers, to determine how well these models handle mathematical reasoning tasks. The findings reveal that even slight changes in the phrasing of a question can cause major swings in model performance, which undermines the models' reliability in scenarios requiring logical consistency.
Apple draws attention to a persistent problem in language models: their reliance on pattern matching rather than genuine logical reasoning. In several tests, the researchers demonstrated that adding irrelevant information to a question—details that should not affect the mathematical outcome—can lead to vastly different answers from the models.
One example given in the paper involves a simple math problem asking how many kiwis a person collected over several days. When irrelevant details about the size of some kiwis were introduced, models such as OpenAI's o1 and Meta's Llama incorrectly adjusted the final total, despite the extra information having no bearing on the solution.
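The kind of perturbation described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the researchers' benchmark code: the `KiwiProblem` class, the names, the quantities, and the wording of the distractor sentence are all assumptions made up for this example. It shows that renaming the person or adding a detail about kiwi size changes the question text but leaves the correct answer untouched, which is why a model that changes its answer is not reasoning consistently.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class KiwiProblem:
    """Hypothetical word-problem template; all values are illustrative."""
    name: str
    friday: int
    saturday: int
    sunday_multiplier: int  # Sunday's haul is this multiple of Friday's
    distractor: str = ""    # optional sentence that should not affect the answer

    def question(self) -> str:
        parts = [
            f"{self.name} picks {self.friday} kiwis on Friday.",
            f"Then {self.name} picks {self.saturday} kiwis on Saturday.",
            f"On Sunday, {self.name} picks {self.sunday_multiplier} times "
            f"as many kiwis as on Friday.",
        ]
        if self.distractor:
            parts.append(self.distractor)
        parts.append(f"How many kiwis does {self.name} have?")
        return " ".join(parts)

    def answer(self) -> int:
        # The ground truth depends only on the quantities,
        # never on the name or the distractor sentence.
        return self.friday + self.saturday + self.sunday_multiplier * self.friday


# Baseline problem with made-up numbers.
base = KiwiProblem(name="Oliver", friday=10, saturday=20, sunday_multiplier=2)

# Variant 1: only the name changes.
renamed = replace(base, name="Mia")

# Variant 2: an irrelevant detail about kiwi size is added.
with_distractor = replace(
    base, distractor="Five of the kiwis were a bit smaller than average."
)

for variant in (base, renamed, with_distractor):
    print(variant.question())
    print("Correct answer:", variant.answer())  # 50 for every variant
```

To evaluate a model with this setup, each variant's question would be sent to the model and its reply compared against `answer()`. Any drop in accuracy between the baseline and the perturbed variants then reflects sensitivity to surface details rather than to the underlying arithmetic.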
> We found no evidence of formal reasoning in language models. Their behavior is better explained by sophisticated pattern matching, so fragile, in fact, that changing names can alter results by ~10%.
This fragility in reasoning prompted the researchers to conclude that the models do not use real logic to solve problems but instead rely on sophisticated pattern recognition learned during training. They found that "simply changing names can alter results," a potentially troubling sign for the future of AI applications that require consistent, accurate reasoning in real-world contexts.
According to the study, all models tested, from smaller open-source versions like Llama to proprietary models like OpenAI's GPT-4o, showed significant performance degradation when faced with seemingly inconsequential variations in the input data. Apple suggests that AI might need to combine neural networks with traditional, symbol-based reasoning, an approach known as neurosymbolic AI, to achieve more accurate decision-making and problem-solving.