GPT-4o is certainly impressive, but it is not "essentially AGI" (I am enjoying creating these little puzzles, though)
@conitzer This one still stumps it, even though a four-year-old can solve it. Or rather, it doesn't 'stump' it. It's hard, but we must resist the temptation to use epistemic terms when talking about these models.
@conitzer (Same answer when I changed 'gives' to 'contains' in the question.)
@victorgijsbers @conitzer This is possibly the clearest illustration I've seen yet of the difference between "reasoning" and the kind of "what has been written about this" heuristics that LLMs do.
Someone needs to teach LLMs to recognize when logic or math needs to be applied, and then how to write and execute code for simple logic and math queries. (I'm thinking LLMs are basically all intuitive "right brain"; they just need an analytical "left brain".)
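(Not claiming this is how anyone actually builds it, but the routing could be as simple as the sketch below; `ask_llm` is a hypothetical stand-in for whatever model API you'd call.)

```python
import subprocess
import sys
import tempfile

def ask_llm(prompt: str) -> str:
    """Hypothetical stand-in for a real LLM API call."""
    raise NotImplementedError("wire this up to your model of choice")

def answer(question: str) -> str:
    # "Left brain" gate: let the model judge whether exact logic/math is needed.
    verdict = ask_llm(
        "Does answering this require exact logic or arithmetic? "
        "Reply YES or NO.\n\n" + question
    )
    if not verdict.strip().upper().startswith("YES"):
        # "Right brain": ordinary free-text completion is fine.
        return ask_llm(question)

    # Otherwise: have the model write a small program for the problem...
    code = ask_llm(
        "Write a self-contained Python script that prints the answer to:\n\n"
        + question
    )
    # ...and run it in a throwaway subprocess instead of trusting the model's
    # own arithmetic. (A real system would sandbox this properly.)
    with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as f:
        f.write(code)
        path = f.name
    result = subprocess.run(
        [sys.executable, path], capture_output=True, text=True, timeout=10
    )
    return result.stdout.strip()
```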
@victorgijsbers @conitzer I mean, this one actually seems pretty easy to me -- the biggest hurdle is putting together a training set of word problems converted into whatever specific language the LLM is going to use internally for doing logic.
Maybe if you had a selection of a dozen or so sandboxed languages that it could use, there'd be enough examples out in the wild... but it still seems a very LLM-specific need (who writes programs to solve extremely simple logic problems?).
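(For concreteness, the kind of trivially simple program such a training set would be full of might look like this; the word problem is made up, not the one from the thread.)

```python
from itertools import permutations

# Toy word problem (invented for illustration):
# "Ann, Ben and Cal ran a race. Ann finished ahead of Ben. Cal did not
#  finish last. Who finished last?"
people = ["Ann", "Ben", "Cal"]

last_place = set()
for order in permutations(people):            # order[0] = first place
    ann_ahead_of_ben = order.index("Ann") < order.index("Ben")
    cal_not_last = order[-1] != "Cal"
    if ann_ahead_of_ben and cal_not_last:     # keep orderings consistent with the clues
        last_place.add(order[-1])

print(last_place)  # {'Ben'} -- the clues pin down a unique answer
```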