When a model gives you the right answer to a reasoning question, you can't tell whether it was via memorization or via reasoning.
A simple way to tell the two apart is to tweak your question in a way that (1) changes the answer and (2) requires some reasoning to adapt to the change. If you still get the same answer as before... it was memorization.
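Here is a rough sketch of that test in Python. Everything in it is hypothetical and made up for illustration: the `probe` helper, the two toy "models" (`memorizer` and `reasoner`), and the apple question. A real check would call an actual model and use many perturbed variants, not just one.

```python
import re
from typing import Callable

# A "model" here is just any function from a question string to an answer string.
Model = Callable[[str], str]


def probe(model: Model, original: str, original_answer: str,
          perturbed: str, perturbed_answer: str) -> str:
    """Run the perturbation test on a single question pair."""
    if model(original) != original_answer:
        return "wrong on original"
    reply = model(perturbed)
    if reply == perturbed_answer:
        return "adapted"        # consistent with reasoning
    if reply == original_answer:
        return "memorized"      # same answer despite the change
    return "inconclusive"


def memorizer(question: str) -> str:
    # Toy stand-in: keys on surface features and returns a canned answer.
    return "5" if "apples" in question else "unknown"


def reasoner(question: str) -> str:
    # Toy stand-in: reads the two numbers and the verb, then computes.
    a, b = (int(n) for n in re.findall(r"\d+", question))
    return str(a - b if "eats" in question else a + b)


original = "Alice has 3 apples and buys 2 more. How many apples does she have?"
perturbed = "Alice has 3 apples and eats 2. How many apples does she have?"

for name, model in [("memorizer", memorizer), ("reasoner", reasoner)]:
    print(name, "->", probe(model, original, "5", perturbed, "1"))
```

Both toy models get the original question right, so the original alone can't separate them. Only the perturbed question does.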

Many people think "reasoning" is a category of tasks -- e.g. involving numbers, riddles, etc. It's not. It's an ability, underpinned by compositional generalization.
You can always solve "reasoning" tasks without reasoning. Just memorize -- either memorize the answer or memorize the general question/answer template.
Arguably, you could say that having memorized the template (a program to generate the solution) and being able to reapply it in a new context *is* a form of reasoning. I basically agree. But it's only a weak form of adaptation to novelty (known unknowns).
