Combining Induction and Transduction for Abstract Reasoning
- id:
2411.02272
- Authors:
Wen-Ding Li, Keya Hu, Carter Larsen, Yuqing Wu, Simon Alford, Caleb Woo, Spencer M. Dunn, Hao Tang, Michelangelo Naim, Dat Nguyen, Wei-Long Zheng, Zenna Tavares, Yewen Pu, Kevin Ellis
- Published:
2024-11-04
- arXiv:
https://arxiv.org/abs/2411.02272
- PDF:
https://arxiv.org/pdf/2411.02272
- DOI:
N/A
- Journal Reference:
N/A
- Primary Category:
cs.LG
- Categories:
cs.LG, cs.AI, cs.CL
- Comment:
N/A
- github_url:
_
abstract
When learning an input-output mapping from very few examples, is it better to first infer a latent function that explains the examples, or is it better to directly predict new test outputs, e.g. using a neural network? We study this question on ARC, a highly diverse dataset of abstract reasoning tasks. We train neural models for induction (inferring latent functions) and transduction (directly predicting the test output for a given test input). Our models are trained on synthetic data generated by prompting LLMs to produce Python code specifying a function to be inferred, plus a stochastic subroutine for generating inputs to that function. We find inductive and transductive models solve very different problems, despite training on the same problems, and despite sharing the same neural architecture.
premise
outline
quotes
notes
summary
Brief Overview
This paper explores the effectiveness of induction and transduction for few-shot learning in abstract reasoning tasks, using the Abstraction and Reasoning Corpus (ARC-AGI) as a benchmark. The authors synthesize a large dataset of problems and train neural networks for both inductive (inferring latent functions) and transductive (directly predicting outputs) approaches. They find that these methods are strongly complementary, and ensembling them achieves near human-level performance.
Key Points
Induction and transduction methods, despite sharing the same architecture and training data, solve different types of ARC-AGI problems.
Induction excels at precise computation and composing multiple concepts, while transduction performs better on fuzzier perceptual concepts.
Ensembling induction and transduction significantly improves performance, approaching human-level accuracy on ARC-AGI (see the decision-rule sketch after this list).
The performance of induction models scales with test-time compute.
A novel data generation pipeline synthesizes a large dataset of ARC-AGI-style problems, starting from a smaller set of manually-written programs and using LLMs for augmentation.
The proposed method’s performance saturates quickly when increasing the number of manually-written seed programs, but scales efficiently with compute.
Notable Quotes
None recorded.
Primary Themes
Few-shot learning: The core focus is on learning from a limited number of examples.
Inductive vs. transductive learning: The paper compares and contrasts these two distinct learning paradigms.
Program synthesis: The use of Python code to represent the latent functions connects the work to program synthesis research.
Ensemble methods: Combining the strengths of different learning approaches is shown to be beneficial.
Data generation: The creation of large, high-quality synthetic datasets is a significant contribution (a seed-program sketch follows below).