Attention Heads of Large Language Models: A Survey

id:

2409.03752

Authors:

Zifan Zheng, Yezhaohui Wang, Yuxin Huang, Shichao Song, Mingchuan Yang, Bo Tang, Feiyu Xiong, Zhiyu Li

Published:

2024-09-05

arXiv:

https://arxiv.org/abs/2409.03752

PDF:

https://arxiv.org/pdf/2409.03752

DOI:

N/A

Journal Reference:

N/A

Primary Category:

cs.CL

Categories:

cs.CL

Comment:

29 pages, 11 figures, 4 tables, 5 equations

github_url:

_

abstract

Since the advent of ChatGPT, Large Language Models (LLMs) have excelled in various tasks but remain black-box systems, and their reasoning bottlenecks are largely determined by their internal architecture. Many researchers have therefore begun exploring the potential internal mechanisms of LLMs, with most studies focusing on attention heads. Our survey aims to shed light on the internal reasoning processes of LLMs by concentrating on the underlying mechanisms of attention heads. We first distill the human thought process into a four-stage framework: Knowledge Recalling, In-Context Identification, Latent Reasoning, and Expression Preparation. Using this framework, we systematically review existing research to identify and categorize the functions of specific attention heads. Furthermore, we summarize the experimental methodologies used to discover these special heads, dividing them into two categories: Modeling-Free methods and Modeling-Required methods. We also outline relevant evaluation methods and benchmarks. Finally, we discuss the limitations of current research and propose several potential future directions.

premise

outline

quotes

notes

summary

  1. Brief Overview

The paper is a survey of research on the internal mechanisms of Large Language Models (LLMs), specifically focusing on attention heads. It proposes a four-stage framework for human thought processes (Knowledge Recalling, In-Context Identification, Latent Reasoning, and Expression Preparation) as an analogy for understanding LLM reasoning. The survey categorizes existing research on attention heads based on this framework, summarizes experimental methodologies (Modeling-Free and Modeling-Required), and discusses limitations and future directions in this field of research.
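
To make the idea of function-specific heads concrete, the sketch below (an illustration of the genre, not code from the paper) scores GPT-2 attention heads for induction behavior, one of the most frequently studied head functions in this literature. On a sequence of random tokens repeated twice, an induction head attends from each token in the second copy back to the token that followed its earlier occurrence. The 0.4 cutoff is an arbitrary display threshold, not a value from the survey.

    import torch
    from transformers import GPT2LMHeadModel

    model = GPT2LMHeadModel.from_pretrained("gpt2")
    model.eval()

    torch.manual_seed(0)
    seq_len = 64
    prefix = torch.randint(100, 5000, (1, seq_len))  # arbitrary token ids
    tokens = torch.cat([prefix, prefix], dim=1)      # "A B C ... A B C ..."

    with torch.no_grad():
        out = model(tokens, output_attentions=True)

    # out.attentions holds one (batch, n_head, query, key) tensor per layer.
    q = torch.arange(seq_len, 2 * seq_len - 1)  # query positions in the second copy
    k = q - seq_len + 1                         # token after the earlier occurrence
    for layer, attn in enumerate(out.attentions):
        score = attn[0][:, q, k].mean(dim=-1)   # mean attention to the target, per head
        for head in (score > 0.4).nonzero().flatten().tolist():
            print(f"layer {layer} head {head}: induction score {score[head].item():.2f}")

Heads that clear the cutoff are only candidates; the surveyed papers typically confirm a head's function with follow-up interventions such as ablation or activation patching.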

  2. Key Points

  • The paper focuses on the latest research on attention heads in LLMs such as LLaMA and GPT, consolidating findings from numerous studies.

  • It introduces a novel four-stage framework for LLM reasoning based on human cognitive processes.

  • Attention heads are categorized based on their functions within the four-stage framework.

  • Experimental methodologies for discovering special attention heads are categorized into Modeling-Free and Modeling-Required methods (see the knockout sketch after this list).

  • The survey discusses limitations of current research, including the relative simplicity of tasks investigated and a lack of overarching frameworks for understanding the collaborative functioning of attention heads.

  • Future directions are proposed, including exploring head mechanisms under more complex tasks, improving the robustness of identified mechanisms to prompt variations, and developing new experimental methods.
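
As promised in the list above, here is a minimal sketch of what a Modeling-Free probe can look like: a zero-ablation (knockout) of a single attention head, measuring how the model's next-token loss shifts when that head's contribution is removed. Everything here is an assumed illustration, including the HuggingFace GPT-2 checkpoint, the arbitrary layer index 9, and the prompt; the survey catalogs a family of such techniques rather than prescribing this one.

    import torch
    from transformers import GPT2LMHeadModel, GPT2Tokenizer

    model = GPT2LMHeadModel.from_pretrained("gpt2")
    model.eval()
    tok = GPT2Tokenizer.from_pretrained("gpt2")
    ids = tok("The Eiffel Tower is located in the city of", return_tensors="pt").input_ids

    def ablated_loss(layer: int, head: int) -> float:
        """Next-token loss with one attention head's output zeroed out."""
        head_dim = model.config.n_embd // model.config.n_head

        def zero_head(module, args):
            # c_proj's input is the concatenation of all head outputs, so
            # zeroing this slice removes exactly one head's contribution.
            x = args[0].clone()
            x[..., head * head_dim:(head + 1) * head_dim] = 0.0
            return (x,)

        handle = model.transformer.h[layer].attn.c_proj.register_forward_pre_hook(zero_head)
        try:
            with torch.no_grad():
                return model(ids, labels=ids).loss.item()
        finally:
            handle.remove()

    with torch.no_grad():
        baseline = model(ids, labels=ids).loss.item()
    for head in range(model.config.n_head):
        print(f"layer 9 head {head}: loss shift {ablated_loss(9, head) - baseline:+.3f}")

Hooking c_proj's input, rather than the attention module's output, matters here: GPT-2 concatenates per-head outputs before the output projection, so only the pre-projection tensor has a clean per-head layout. Heads whose knockout shifts the loss sharply are candidates for a special function under this crude metric.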

  3. Notable Quotes

None are explicitly highlighted; the paper's four-stage framework itself serves as its central organizing analogy.

  4. Primary Themes

  • Mechanistic Interpretability of LLMs: The central theme is understanding how LLMs work internally, specifically focusing on the role of attention heads.

  • Analogy to Human Cognition: The four-stage framework of human thought processes serves as a lens for interpreting the functions of attention heads within LLMs.

  • Categorization and Classification: The survey organizes and categorizes both attention head functions and experimental methodologies for improved understanding and future research.

  • Limitations and Future Directions: The authors explicitly address the limitations of current research and offer suggestions for future investigations.