I’m eager to read an ICLR paper discussing whether an LLM trained to "play" Othello learned a model of the board, not just surface statistics. But I’m immediately wary, based on a blog post by the lead author, who wrote: "the recent increase in model and data size has brought about qualitatively new behaviors such as writing basic code[1] or solving logic puzzles[2]."
The co-host of Mystery AI Hype Theatre 3000, @emilymbender, reminds us to "always read the footnotes" — these ones have issues!
First, here is the blog post: https://thegradient.pub/othello/
Footnote [1] is a tweet in praise of GitHub Copilot, which was trained on a large database of code. When Copilot writes code, that is not a “qualitatively new behavior”; it is a model doing what it has been trained to do.
Footnote [2] is more puzzling. To support the claim that LLMs exhibit the “qualitatively new behavior [of] solving logic puzzles”, the author cites a tweet by @dfeldman that actually shows an LLM *failing* to do logic. The tweet asks: "Can GPT-3 solve simple logic puzzles?" and shows a series of screenshots of GPT-3 text completions, beginning with:
Prompt: "Q: Alice is shorter than Bob. Bob is taller than Charlie. Is Alice shorter than Charlie? A:"
Completion: "Yes, Alice is shorter than Charlie."
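To spell out why that completion counts as a failure: the premises only tell us that Alice and Charlie are each shorter than Bob; they say nothing about Alice relative to Charlie, so the honest answer is “it can’t be determined”. A quick, throwaway Python sketch (mine, not from the tweet or the paper) that brute-forces the possible height orderings makes this concrete:

```python
from itertools import permutations

people = ["Alice", "Bob", "Charlie"]

# Enumerate every strict ordering of heights (shortest -> tallest) and keep
# the ones consistent with the premises: Alice < Bob and Charlie < Bob.
for order in permutations(people):
    rank = {name: i for i, name in enumerate(order)}
    if rank["Alice"] < rank["Bob"] and rank["Charlie"] < rank["Bob"]:
        print(order, "-> Alice shorter than Charlie?",
              rank["Alice"] < rank["Charlie"])
```

This prints two orderings, one in which Alice is shorter than Charlie and one in which she isn’t, so both answers are consistent with the premises. A confident “Yes” is exactly the kind of plausible-sounding non-reasoning the tweet was pointing at.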
I’m left to wonder: when the author of the Othello blogpost cited this tweet, did they realise it contradicted their point? Or, when they saw the rhetorical question (“Can GPT-3 solve simple logic puzzles?”), did they just skim the screenshots, see that they looked vaguely like a chatbot solving a logic puzzle, and assume that the answer was “yes”?
Either way, these opening footnotes smack of something; whether that something is sloppiness or credulousness, I’m not sure it matters!
@dfeldman’s tweet, from June 2022, left me curious whether the current version of ChatGPT (3.5) would also fail on this syllogism. It did! So did Anthropic's Claude.
So, can text-completion systems solve logic puzzles? Well, solving a syllogism is a prerequisite to solving a logic puzzle. ChatGPT cannot solve a syllogism. Therefore...