Communicating Natural Programs to Humans and Machines
- id:
2106.07824
- Authors:
Samuel Acquaviva, Yewen Pu, Marta Kryven, Theodoros Sechopoulos, Catherine Wong, Gabrielle E Ecanow, Maxwell Nye, Michael Henry Tessler, Joshua B. Tenenbaum
- Published:
2021-06-15
- arXiv:
https://arxiv.org/abs/2106.07824
- PDF:
https://arxiv.org/pdf/2106.07824
- DOI:
N/A
- Journal Reference:
N/A
- Primary Category:
cs.AI
- Categories:
cs.AI
- Comment:
equal contributions: (author 1,2) and (author 3,4,5). 36th Conference on Neural Information Processing Systems (NeurIPS 2022) Track on Datasets and Benchmarks
- github_url:
_
abstract
The Abstraction and Reasoning Corpus (ARC) is a set of procedural tasks that tests an agent's ability to flexibly solve novel problems. While most ARC tasks are easy for humans, they are challenging for state-of-the-art AI. What makes building intelligent systems that can generalize to novel situations such as ARC difficult? We posit that the answer might be found by studying the difference of language: While humans readily generate and interpret instructions in a general language, computer systems are shackled to a narrow domain-specific language that they can precisely execute. We present LARC, the Language-complete ARC: a collection of natural language descriptions by a group of human participants who instruct each other on how to solve ARC tasks using language alone, which contains successful instructions for 88% of the ARC tasks. We analyze the collected instructions as "natural programs," finding that while they resemble computer programs, they are distinct in two ways: First, they contain a wide range of primitives; Second, they frequently leverage communicative strategies beyond directly executable code. We demonstrate that these two distinctions prevent current program synthesis techniques from leveraging LARC to its full potential, and give concrete suggestions on how to build next-generation program synthesizers.
premise
outline
quotes
notes
summary
1. Brief Overview
This paper introduces LARC (the Language-complete Abstraction and Reasoning Corpus), a dataset that augments ARC (the Abstraction and Reasoning Corpus) with natural language instructions for solving its procedural tasks. While most ARC tasks are easy for humans, they remain challenging for state-of-the-art AI. The authors argue that studying how humans communicate procedural knowledge in natural language can inform the design of more robust AI systems. They analyze the collected instructions as "natural programs," which resemble computer programs but differ in the primitives they draw on and in their use of communicative strategies beyond directly executable code. Finally, they evaluate current program synthesis techniques on LARC, showing that the complexity of natural language instructions limits these methods and suggesting directions for future research in program synthesis.
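To make the task/instruction pairing concrete, here is a toy, invented example of the kind of record LARC provides: an ARC-style grid task, an instruction of the sort a describer might write, and a DSL-style solution. The grids, the instruction wording, and the function name are all hypothetical, not taken from the dataset.

```python
# Toy LARC-style record: an ARC-like grid task paired with a natural-language
# instruction ("natural program"). Grid values are colors; 0 is background.
task = {
    "train": [
        {"input":  [[0, 3, 0],
                    [0, 3, 0],
                    [0, 3, 0]],
         "output": [[3, 3, 3],
                    [3, 3, 3],
                    [3, 3, 3]]},
    ],
    # What a successful describer instruction might look like:
    "natural_program": "Flood the whole grid with the color of the vertical line.",
}

def solve(grid):
    """Hypothetical DSL-style solution: find the non-background color,
    then fill every cell with it."""
    color = next(c for row in grid for c in row if c != 0)
    return [[color for _ in row] for row in grid]

assert solve(task["train"][0]["input"]) == task["train"][0]["output"]
```

The contrast the paper draws is between the executable `solve` above, which is locked to a fixed DSL, and the natural program, which a human can interpret flexibly.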
2. Key Points
LARC pairs ARC tasks with natural language instructions collected through a human communication game in which one participant describes how to solve a task and another attempts to solve it from the description alone; 88% of ARC tasks end up with at least one successful instruction.
Human-generated instructions ("natural programs") in LARC differ from computer programs in two ways: they draw on a much wider range of primitives, and they frequently use communicative strategies beyond directly executable code.
Current program synthesis techniques struggle to fully leverage LARC due to the complexity of natural programs.
LARC can be used to study how humans communicate and interpret procedures, and to inform the design of future program synthesizers.
A novel bandit algorithm was developed to allocate annotation attempts efficiently while collecting the LARC dataset (see the sketch after this list).
Experiments show that conditioning on language improves program synthesis, but current methods still fall far short of human performance.
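The paper develops its own bandit algorithm for dataset collection; the summary above gives no details, so the sketch below is a generic UCB1 stand-in, not the authors' method. It only illustrates the idea of routing a fixed annotation budget using per-task attempt/success statistics; the reward framing, success rates, and exploration constant are all assumptions.

```python
import math
import random

def ucb1_pick(attempts, successes, t, c=2.0):
    """Return the task index with the highest UCB1 score:
    empirical success rate plus an exploration bonus."""
    def score(i):
        if attempts[i] == 0:
            return float("inf")      # each task gets at least one attempt
        mean = successes[i] / attempts[i]
        return mean + math.sqrt(c * math.log(t) / attempts[i])
    return max(range(len(attempts)), key=score)

# Simulated collection loop: 5 tasks with hidden per-attempt success rates
# (an attempt "succeeds" when a builder reproduces the output grid from the
# describer's instruction alone).
random.seed(0)
true_rates = [0.9, 0.6, 0.4, 0.2, 0.05]
attempts = [0] * len(true_rates)
successes = [0] * len(true_rates)
for t in range(1, 301):
    i = ucb1_pick(attempts, successes, t)
    attempts[i] += 1
    successes[i] += random.random() < true_rates[i]
print(attempts)  # the budget concentrates where instructions tend to succeed
```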
3. Notable Quotes
No notable quotes were identified.
4. Primary Themes
Human-Computer Communication: The central theme explores the differences between how humans and machines communicate procedural instructions.
Program Synthesis: The paper investigates the limitations of existing program synthesis techniques on natural programs and proposes new directions informed by LARC (see the language-guided search sketch after this list).
Cognitive Modeling: LARC serves as a window into human cognitive processes involved in solving procedural tasks and communicating instructions.
Benchmarking AI: LARC provides a challenging benchmark to evaluate the generalization capabilities of AI systems.
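As rough intuition for why language helps synthesis: the paper's actual systems condition learned neural models on the instructions, whereas the toy sketch below merely biases an enumerative search over an invented three-primitive DSL using keywords from the instruction. The primitives, keyword map, and example task are all hypothetical.

```python
import itertools

# Invented mini-DSL of grid transformations.
PRIMITIVES = {
    "mirror":   lambda g: [row[::-1] for row in g],
    "flip":     lambda g: g[::-1],
    "identity": lambda g: g,
}
# Hypothetical keyword-to-primitive associations (the paper uses learned
# models, not keyword matching).
KEYWORDS = {"mirror": ["mirror", "reflect", "left-right"],
            "flip":   ["flip", "upside", "top-bottom"]}

def ranked_primitives(instruction):
    """Order primitives so that language-favored ones are tried first."""
    words = instruction.lower()
    def weight(name):
        return -sum(w in words for w in KEYWORDS.get(name, []))
    return sorted(PRIMITIVES, key=weight)

def synthesize(instruction, examples, max_depth=2):
    """Enumerate compositions of primitives, language-biased, and return
    the first composition consistent with all input/output examples."""
    order = ranked_primitives(instruction)
    for depth in range(1, max_depth + 1):
        for names in itertools.product(order, repeat=depth):
            def prog(g, names=names):
                for n in names:
                    g = PRIMITIVES[n](g)
                return g
            if all(prog(i) == o for i, o in examples):
                return names
    return None

examples = [([[1, 2], [3, 4]], [[2, 1], [4, 3]])]
print(synthesize("reflect the grid left-right", examples))  # -> ('mirror',)
```

Even this crude bias finds the consistent program sooner than blind enumeration would; the paper's point is that exploiting full natural programs, with their richer primitives and non-executable communicative moves, requires much more than this.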