Abstract
The recent advent of large language models has reinvigorated debate over
whether human cognitive capacities might emerge in such generic models given
sufficient training data. Of particular interest is the ability of these models
to reason about novel problems zero-shot, without any direct training. In human
cognition, this capacity is closely tied to an ability to reason by analogy.
Here, we performed a direct comparison between human reasoners and a large
language model (the text-davinci-003 variant of GPT-3) on a range of analogical
tasks, including a non-visual matrix reasoning task based on the rule structure
of Raven's Standard Progressive Matrices. We found that GPT-3 displayed a
surprisingly strong capacity for abstract pattern induction, matching or even
surpassing human capabilities in most settings; preliminary tests of GPT-4
indicated even better performance. Our results indicate that large language
models such as GPT-3 have acquired an emergent ability to find zero-shot
solutions to a broad range of analogy problems.
Users
Please
log in to take part in the discussion (add own reviews or comments).