AI

By Michael A. Covington

The word is out — large language models, systems like ChatGPT, can’t reason. That’s a problem, because reasoning is what we…
[See the full post at: LLMs can’t reason]
LLMs can’t reason
Tags: AI, Artificial intelligence, Large language model, machine learning, Math Word Problems, Neural Networks, Neuroscience, Newsletters, Reasoning
Michael Covington | AskWoody_MVP | December 30, 2024 at 2:44 am | #2729748
Alan Hoffberg | AskWoody Plus | December 30, 2024 at 6:07 am | #2729777
It’s 4:30 am as I read about “LLMs can’t reason.” Mr. Covington’s submission greatly intrigued me on the topics of ChatGPT and LLMs, and led me to read about half of Mr. Wolfram’s treatise early this morning.
“Thank you,” Mr. Covington! I plan to read the remainder of Mr. Wolfram’s treatise later today.
Alan M. Hoffberg, CPA, MBA. Retired from many vocations in the computer field about 40 years ago.
Sky | AskWoody Plus | December 30, 2024 at 1:49 pm | #2729935
I can’t praise this article enough. While I am a big believer that machine-learning-based AI already has and will have huge power in the fields to which it is suited, the LLM hype as some universal mind that will surpass humans any day now (which has been promised for decades) seems fundamentally overblown to me, and this is probably the best summary I have read of why that is the case. It laid out my thoughts on the matter more clearly than I ever could. A really important article, and really well done.
Casey S | AskWoody Plus | December 30, 2024 at 4:02 pm | #2729987
I fed both examples through Perplexity.ai, the favorite of one or more AskWoody authors. Without hesitation, it provided the correct math computation of the kiwis and even stated that the 5 smaller ones would still be included.
I then fed it the “If it were true that all corgis are reptiles, and all reptiles are plants, does it follow that all corgis are plants?” logic question. At first it rejected the premise as “biologically impossible” and asked if I wanted to rephrase it in a way that maintains biological accuracy.
I instructed it, “No, analyze it as if it were true.”
It then “apologized” and provided the correct logic of “if A equals B, and B equals C, then A equals C”.
Now, if only it could reason “the Ultimate Question of Life, the Universe, and Everything.” And do it in less than seven million years…
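Strictly speaking, the pattern at work is transitivity of implication rather than of equality: all A are B, all B are C, therefore all A are C. The inference is simple enough to check mechanically; here is a minimal sketch in Lean 4, with the predicate names chosen purely for illustration:

```lean
-- A minimal sketch of the corgi syllogism in Lean 4.
-- The predicate names are illustrative; only the implication
-- structure matters, not the (false) biology.
variable (Animal : Type)
variable (Corgi Reptile Plant : Animal → Prop)

-- If all corgis are reptiles, and all reptiles are plants,
-- then all corgis are plants.
theorem corgi_syllogism
    (h1 : ∀ x, Corgi x → Reptile x)
    (h2 : ∀ x, Reptile x → Plant x) :
    ∀ x, Corgi x → Plant x :=
  fun x hc => h2 x (h1 x hc)
```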
allanhm47 | AskWoody Plus | January 1, 2025 at 11:47 am | #2730406
What a great article. I also fed the math question into Perplexity. The answer came back as 185 for me. I then asked, “Why did you subtract 5 smaller kiwis? He picked them and has them.” Answer: “You’re absolutely right, and I apologize for the mistake in my previous answer. I misinterpreted the information about the 5 smaller kiwis. Let’s correct the calculation: Oliver picked a total of 190 kiwis,” and so on.
bbearren | AskWoody MVP | January 1, 2025 at 2:56 pm | #2730470
I put the kiwi question to my offline LLM. I got this:
llama3.1:latest
Let’s break it down step by step:

On Friday, Oliver picked 44 kiwis.
On Saturday, Oliver picked 58 kiwis.
On Sunday, he picked double the number of kiwis he did on Friday, which is:
2 x 44 = 88 kiwis
However, since 5 of them were a bit smaller than average, we need to subtract those from the total:
88 – 5 = 83

So far, Oliver has:
Friday: 44
Saturday: 58
Sunday: 83

Adding all these together, Oliver has a total of:
44 + 58 + 83 = 185 kiwis.

Is that the answer you were looking for?
The main question asked is, “How many kiwis does Oliver have?” You made the implication that the smaller kiwis were not to be included in the total; however, Oliver does, indeed, have them. There was no mention in the question of discarding the smaller kiwis.
llama3.1:latest
Since there’s no indication that Oliver should discard the smaller kiwis, we should count all the kiwis he picked.

So, on Sunday, Oliver picked double the number of kiwis he did on Friday, which is:
2 x 44 = 88
This includes the 5 smaller kiwis.

Therefore, the total number of kiwis Oliver has is:
Friday: 44
Saturday: 58
Sunday: 88

And adding all these together gives us:
44 + 58 + 88 = 190

I then asked this:
Are you able to tell me why, in your first answer, you assumed that the five smaller than average kiwis should not be counted?
llama3.1:latest
I made an assumption based on a common mathematical convention: when we’re counting items and some of them don’t meet certain criteria (in this case, being “smaller than average”), it’s often implied that those items shouldn’t be included in the total count. This is especially true in problems involving averages or statistics.
However, as you pointed out, there was no explicit instruction to exclude the smaller kiwis from the count. In retrospect, I should have counted all 88 kiwis picked on Sunday, including the 5 smaller ones.
It’s a great reminder that assumptions can be problematic, and it’s essential to stick strictly to the problem statement when providing solutions!
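For reference, the arithmetic itself is trivial once the smaller kiwis are recognized as a distractor; a few lines of Python reproduce both the correct total and the 185 that results from wrongly subtracting them:

```python
# The kiwi word problem as plain arithmetic.
friday = 44
saturday = 58
sunday = 2 * friday   # "double the number he did on Friday" = 88
smaller = 5           # a distractor: smaller kiwis are still kiwis

correct_total = friday + saturday + sunday            # 190
wrong_total = friday + saturday + (sunday - smaller)  # 185, the trap

print(correct_total, wrong_total)  # 190 185
```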
bbearren | AskWoody MVP | January 1, 2025 at 5:17 pm | #2730515
I proposed the same math problem to Llama3.1:70b. Here is its solution:
llama3.1:70b
“What a delightful problem!

Let’s break it down step by step:
Friday: Oliver picks 44 kiwis.
Saturday: He adds 58 more kiwis to his collection.

So far, he has:
44 (Friday) + 58 (Saturday) = 102 kiwis

Now, on Sunday, he picks double the number of kiwis he did on Friday. Since he picked 44 kiwis on Friday, he’ll pick twice that amount on Sunday, which is:
2 x 44 = 88 kiwis

However, we’re told that 5 of these kiwis are a bit smaller than average. This doesn’t affect the total count, though – they still count as kiwis!

So, Oliver now has:
102 (kiwis picked on Friday and Saturday) + 88 (kiwis picked on Sunday) = 190 kiwis

There you have it! Oliver has a grand total of 190 kiwis.”
I find this quite interesting, particularly, “However, we’re told that 5 of these kiwis are a bit smaller than average. This doesn’t affect the total count, though – they still count as kiwis!”
I also put the same question to Copilot online, and got this response:
“Let’s break it down step-by-step:
Friday: Oliver picks 44 kiwis.
Saturday: Oliver picks 58 kiwis.
Sunday: He picks double the number of kiwis he did on Friday, which is 2 × 44 = 88 kiwis. But out of these, 5 were a bit smaller than average, which doesn’t change the total number.

Now, we add up the total number of kiwis:
44 (Friday) + 58 (Saturday) + 88 (Sunday) = 190

So, Oliver has 190 kiwis in total. That’s quite the haul!”
Copilot also got the correct answer, but used some unorthodox context/grammar in its answer.
I would say that not all LLMs can be lumped together in any particular category, and/or that math word problems aren’t a particularly daunting task in all cases. My smaller model (8 billion parameters) made an assumption, but my larger model (70 billion parameters) made no such assumption, and even clarified that smaller kiwis are still kiwis, as did Copilot.
Chris Miller | Guest | December 30, 2024 at 5:08 pm | #2730002
Thanks for this great article, and not just because it confirms my preconceptions!
A thought that occurs to me: I wonder what differences there are in LLMs trained using languages other than English. There’s substantially less material on the Internet written in French or Spanish than in English (and still less in Dutch or Finnish), but perhaps the structure of the language will change the way these systems respond to our questions. What about Chinese?!
Michael Covington | AskWoody_MVP | December 30, 2024 at 5:27 pm | #2730008
You’ve just asked my favorite computational linguistics question, the question nobody dares to ask.
Obviously there is much less training material in languages other than English. But there’s also another problem.
LLMs treat language as consisting of discrete words in fixed order. That works well for English and should work well for Chinese, because words have only a few forms in these languages, and the order is fixed. In Chinese, each word has only 1 form. In English, a verb can have 5 forms but usually has only 4 (base, -s, -ed, -ing), and nouns have 2 forms (singular and plural). In French a verb has dozens of forms. In Russian the nouns also have multiple forms, and the word order is appreciably more variable because noun suffixes, rather than word order, indicate subject and object. So even if you had the same amount of training material in Russian as in English, it would be less effective because there’s more for the LLM to model.
Thank you for your interest!
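To make the scale of that morphological difference concrete, here is a toy Python sketch. The per-lemma form counts and lexicon sizes are rough illustrative guesses, not serious linguistics, but they show how quickly a word-level vocabulary multiplies in a highly inflected language:

```python
# A toy illustration (not a serious morphological model) of why the same
# number of lemmas yields very different word-level vocabulary sizes.
# Approximate distinct surface forms per lemma, by part of speech:
forms_per_lemma = {
    "Chinese": {"noun": 1, "verb": 1},
    "English": {"noun": 2, "verb": 4},    # dog/dogs; walk/walks/walked/walking
    "French":  {"noun": 2, "verb": 40},   # rough count across tenses/persons
    "Russian": {"noun": 12, "verb": 30},  # rough count across cases/aspects
}

lemmas = {"noun": 10_000, "verb": 5_000}  # hypothetical lexicon sizes

for language, forms in forms_per_lemma.items():
    vocab = sum(lemmas[pos] * n for pos, n in forms.items())
    print(f"{language:8s} ~{vocab:,} distinct surface forms to model")
```

On these made-up numbers, the same lexicon yields roughly 15,000 surface forms for Chinese but over 250,000 for Russian, which is the sense in which equal amounts of training text go less far.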
bbearren | AskWoody MVP | December 30, 2024 at 10:59 pm | #2730076
I dual boot, and I have an LLM on each side running on WSL; both are conversational Llama LLMs. On one side I have Llama 3.1 8B, with 8 billion parameters, and on the other Llama 3.1 70B, with 70 billion parameters. I don’t have a GPU, so these run on CPU only (Intel i5-12600K, not overclocked).
Both use ~60% CPU, with bumps to 70% (idle with only Firefox open is 2–4%). As for RAM, I have 64GB; the 8B model uses ~40% pretty consistently, while the 70B model uses a bit over 90% (idle with only Firefox open is 16%). I use Open WebUI to converse, so Firefox is running.
The models’ footprint grows with use. The 8B started (with WSL/Ubuntu 24.04 overhead) at 10GB, and is now at 21GB. The 70B started at 100GB, and is now at 110GB. I’ve asked both models to optimize their caching and help control their growth. Once I got Ubuntu and Docker installed, I had the models help me with the command line instructions to install Open WebUI. They’re very good with that.
Both work (scrub?) for a bit after a prompt from me before replying. The 70B model takes quite a while, and its responses are typed at a rate slower than my normal typing speed. The 8B model types much faster, and with long responses, I can’t keep up and have to wait for it to finish, then scroll down to pick back up where I lost it.
They are both good conversationalists. I have asked each model to critique some writing, and in their critiques both have made minor contextual errors, but not many. I started with the 8B model, just following my curiosity. It ran successfully enough that I wanted to see if my no-GPU hardware could handle the larger 70B. It can, just a good bit slower. But while it’s ‘pondering’ my prompt, I can browse the internet.
I find AI interesting and a bit useful, but not at all a necessity.
wavy | AskWoody Plus
bbearren | AskWoody MVP | January 5, 2025 at 10:15 pm | #2731485
The Llama LLMs are built for Linux, and can run in Windows Subsystem for Linux, WSL (my installation particulars are here). In order to get more direct access to the LLM, I install Ubuntu 24.04 in WSL, then Docker (basically a container manager) in Ubuntu, then install Ollama, a Linux program that can run Llama, in a Docker container, then use Ollama to pull a Llama model into the container.
Once all that is set up, I install Open WebUI, another Linux program, in another Docker container, and I can converse with my LLM via Open WebUI in Firefox using a Localhost port. From then on, it’s just like posting right here in AskWoody and getting replies to my posts. In other words, a chatbot, only it’s completely local, running on a VHD and cut off from the internet.
Open WebUI keeps track of my conversations with the LLM, so I can go back to a previous conversation which allows the LLM to call up the particulars of that conversation through its algorithms and cache mechanisms.
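For anyone who would rather script against such a setup than chat through Open WebUI, a local Ollama instance also exposes an HTTP API. Here is a minimal Python sketch; the model tag and prompt are placeholders, and the port assumes Ollama’s documented default of 11434:

```python
# Minimal sketch: query a local Ollama instance over its HTTP API.
# Assumes Ollama is listening on its default port (11434) and that
# a model tagged "llama3.1:latest" has already been pulled.
import json
import urllib.request

payload = {
    "model": "llama3.1:latest",
    "prompt": "Oliver picks 44 kiwis on Friday ... How many kiwis does he have?",
    "stream": False,  # return one complete JSON reply instead of a stream
}

req = urllib.request.Request(
    "http://localhost:11434/api/generate",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)

with urllib.request.urlopen(req) as resp:
    answer = json.loads(resp.read())

print(answer["response"])  # the model's reply as plain text
```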
The larger LLM presented a good semblance of reasoning; “However, we’re told that 5 of these kiwis are a bit smaller than average. This doesn’t affect the total count, though – they still count as kiwis!”
Was that more than you wanted to know, @wavy?
shadonoz | AskWoody Plus | December 31, 2024 at 6:29 pm | #2730280
“Machine learning is inherently approximate and always will be.”
Likewise, human thinking and comprehension.
Michael Covington | AskWoody_MVP | December 31, 2024 at 9:56 pm | #2730303
So when you say 2 + 2 = 4 you are only being approximate?
Human reasoning is limited. But humans can perform explicit symbolic reasoning (such as arithmetic or highly explicit logic) that is not approximate. It is not the easiest kind of thinking, but it is something we do.
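A short Python illustration of the difference between exact symbolic computation and approximate representation (a loose analogy for the point above, not Mr. Covington’s own example):

```python
# Exact symbolic reasoning vs. approximate computation.
from fractions import Fraction

# Symbolic/exact: rational arithmetic never loses precision.
exact = Fraction(1, 10) + Fraction(2, 10)
print(exact == Fraction(3, 10))   # True: 1/10 + 2/10 is exactly 3/10

# Approximate: binary floating point can only approximate these values.
approx = 0.1 + 0.2
print(approx == 0.3)              # False: 0.30000000000000004 != 0.3
```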
shadonoz | AskWoody Plus | January 1, 2025 at 4:36 pm | #2730503
In reply to: “So when you say 2 + 2 = 4 you are only being approximate? Human reasoning is limited. But humans can perform explicit symbolic reasoning (such as arithmetic or highly explicit logic) that is not approximate. It is not the easiest kind of thinking, but it is something we do.”
The example you offer is calculation, not reasoning, thinking, or comprehension. Machines can already do that. It’s not relevant to my comment.
I wasn’t disagreeing with you, but simply pointing out that there is nothing precise about human thinking. If there were, then perhaps machines WOULD eventually learn to “reason”. That’s what’s so magical and mysterious about human thought.
I hope and trust that your premise, that machines will never be capable of reason as we know it, is correct. But that doesn’t mean that they will not gain capabilities that we cannot imagine at this time, just as they already have – capabilities that could be just as problematic or dangerous.
Thanks for the article and setting the record straight.
ibe98765 | AskWoody Plus | January 5, 2025 at 10:07 pm | #2731477
Well, yes, there has been plenty written on the limitations of LLMs and how they process data.
The question I’m interested in is: what is the next stage in the development of AI? What does it entail? How will it be different from LLMs? Researchers are surely working on such technologies. How do we find out more about this? Anyone have any links?
Second, we are in the very early stages of developing AI, but too many people seem to think that we are much further along than is realistic. We are at Model T-level technology, with a long way to go until we get to the automotive technology of the 1970s, let alone the 2020s.
I think we first need to better understand how our own brains work if we want to take a stab at making “intelligent”, and perhaps even conscious, AIs in the same way our 3-lb brains are.
I find it amazing that with hundreds of years of brain research, billions upon billions of dollars spent in this area, and all the highly technical tools we have today, we STILL don’t know how data is stored or retrieved in the brain, cannot read such data like a computer memory, and have no idea what consciousness is or where it comes from (the sperm, the egg, the union of the two?).