Welcome back to our model showdown series, where we put popular LLMs to the test. After testing creativity, it’s time to check their raw intelligence. Can they think logically, or just mimic human speech?
We gave three top models a classic brain teaser to find out.
The Prompt:
There are 5 people in a room. Every person shakes hands with every other person exactly once. How many handshakes were there in total? Explain your reasoning.
The answer is 10. Let’s see who got it right and, more importantly, how they got there.
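If you want to convince yourself of the expected answer before reading the responses, a brute-force check is enough. The following is a minimal sketch, not part of the original test: the labels A through E are made up for illustration, and each unordered pair of people is treated as one handshake.

```python
from itertools import combinations

# Hypothetical labels for the five people in the room.
people = ["A", "B", "C", "D", "E"]

# Each unordered pair of distinct people corresponds to exactly one handshake.
handshakes = list(combinations(people, 2))

for first, second in handshakes:
    print(f"{first} shakes hands with {second}")

print(len(handshakes))  # 10
```

Because `combinations` yields each pair only once (A-B, never B-A as well), no correction for double counting is needed here.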
The Contenders
Llama 3 8B: The Clear Thinker
A classic problem!
Let’s think about it: each person has to shake hands with all the others, except for themselves… Person A shakes hands with 4 people…
Notice that when we count the handshakes between Person A and Person B, we’ve already counted the handshake between Person B and Person A… each handshake has been counted twice…
Total Handshakes = 5(4) × 1/2 = 20 ÷ 2 = 10
So, there were 10 handshakes in total.
Vibe Check: Perfect. Llama 3 not only gives the correct answer but explains the logic in a simple, intuitive way that anyone can understand. It walks you through the problem step-by-step. A+ for clarity and accuracy.
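Llama 3's double-counting argument can be mirrored step by step in code. This is only an illustrative sketch with made-up labels for the people: count handshakes from each person's point of view, then halve the total because every handshake was seen by both participants.

```python
# Hypothetical labels for the five people in the room.
people = ["A", "B", "C", "D", "E"]

# From each person's point of view: they shake hands with everyone but themselves.
per_person = {p: sum(1 for q in people if q != p) for p in people}

# Adding these up counts A-B and B-A separately, so every handshake appears twice.
counted_twice = sum(per_person.values())  # 5 * 4 = 20

# Halving removes the double counting.
total_handshakes = counted_twice // 2

print(per_person["A"], counted_twice, total_handshakes)  # 4 20 10
```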
Qwen 7B: The Mathematician
To find the total number of handshakes… we use the formula for pairs of handshakes:
Total Handshakes = (Number of people) * (Number of people – 1) / 2
In this case:
Number of people = 5
Total Handshakes = 5 * (5 – 1) / 2 = 5 * 4 / 2 = 20 / 2 = 10
So there were a total of 10 handshakes…
Vibe Check: Correct and efficient. Qwen doesn’t bother with intuitive explanations; it goes straight for the mathematical formula and solves it flawlessly. It’s the “engineer’s answer”—fast, accurate, and to the point.
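The formula Qwen used generalizes to any room size. As a rough sketch (the function names `handshake_count` and `incremental_count` are invented for this example), it can be cross-checked against a different way of counting: the first person shakes n - 1 hands, the next person n - 2 new ones, and so on down to 0.

```python
def handshake_count(n: int) -> int:
    """Closed form quoted above: n * (n - 1) / 2."""
    return n * (n - 1) // 2


def incremental_count(n: int) -> int:
    """Sum of new handshakes per person: (n - 1) + (n - 2) + ... + 1 + 0."""
    return sum(range(n))


for n in (2, 3, 5, 6):
    # Both counting methods should agree for every room size.
    assert handshake_count(n) == incremental_count(n)
    print(n, handshake_count(n))
# 2 1
# 3 3
# 5 10
# 6 15
```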
Mistral 7B Instruct: The Confused Genius
…we can use the formula for combinations: C(n, k) = n! / (k!(n-k)!)…
Total handshakes = C(5, 2) = 5! / (2!(5-2)!) = 10… (Wait, it calculated 10 here)
However, this count includes each handshake twice… So to get the actual number… we divide the total possible handshakes by 2: Unique handshakes = … 45 / 2 = 22.5
Since it’s not possible to have half a handshake, there were 22 actual handshakes in total.
Vibe Check: Total failure. This is fascinating. Mistral correctly identifies the right formula (C(5, 2)), correctly calculates it as 10 in its head, but then wrongly decides that this count includes each handshake twice and gets lost in a bizarre maze of its own logic. It hallucinates a 10! that becomes 45, divides it by 2, gets a fraction, and then confidently rounds down to a completely wrong answer. It’s a perfect example of an AI confidently bullshitting its way to an incorrect conclusion.
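To see where the reasoning breaks, it helps to check the arithmetic directly. The sketch below uses Python's standard `math.comb` and `math.factorial`; the variable names are ours. It shows that C(5, 2) already counts unordered pairs, so no further halving is needed, and that the final step in the quoted output starts from a number that does not belong to this problem. As an aside, 45 happens to equal C(10, 2), but whether that is where the model's figure came from is only a guess.

```python
from math import comb, factorial

n, k = 5, 2

# The combinations formula from the quoted output: n! / (k! * (n - k)!).
by_formula = factorial(n) // (factorial(k) * factorial(n - k))
by_library = comb(n, k)
print(by_formula, by_library)  # 10 10

# C(5, 2) already counts unordered pairs, so halving it again undercounts.
halved_again = by_library / 2
print(halved_again)  # 5.0

# The final step in the quoted output: 45 / 2, which is not a whole number.
quoted_final_step = 45 / 2
print(quoted_final_step)  # 22.5

# Observation only: 45 is the handshake count for a room of 10 people.
print(comb(10, 2))  # 45
```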
The Verdict
In the battle of logic, Llama 3 and Qwen are the clear winners, arriving at the correct answer through two different but valid methods.
The biggest loser is Mistral, which suffered a catastrophic failure in reasoning. It’s a powerful reminder that even popular models can get basic logic spectacularly wrong.
The ability to run these comparisons side-by-side in VibeFlowChat is exactly why we built it—to quickly and easily see which AI brain is the right tool for the job.