MC-LARC
From Generation to Selection: Findings of Converting Analogical Problem-Solving into Multiple-Choice Questions
Donghyeon Shin* · Seungpil Lee* · Klea Lena Kovačec · Sundong Kim†
EMNLP Findings 2024
This page is the official GitHub repository for “From Generation to Selection: Findings of Converting Analogical Problem-Solving into Multiple-Choice Questions”.
We release several versions of MC-LARC, which are available here.
If you have any questions about our dataset, please contact us at shindong97411@gmail.com.
Main Results
MC-LARC’s Effect: MC-LARC, a multiple-choice version of ARC, raised LLM accuracy from 10% to 76%. It targets the “Understand” and “Apply” stages of Bloom’s Taxonomy, making lower-level reasoning abilities easier to assess.
LLM Shortcuts: LLMs often exploited shortcuts, such as eliminating options based on repeated expressions or formatting, rather than actually reasoning about the task. This inflated performance, especially when images were not provided.
Self-Feedback to Reduce Shortcuts: We introduce a self-feedback framework in which LLMs refine the answer options after solving the questions, reducing shortcut use and improving the reliability of the evaluation.
Citation
If you find our paper helpful in your research, please consider citing it:
@inproceedings{shin2024from,
title={From Generation to Selection: Findings of Converting Analogical Problem-Solving into Multiple-Choice Questions},
booktitle={Findings of the Association for Computational Linguistics: EMNLP 2024},
author={Shin, Donghyeon and Lee, Seungpil and Kovacec, Klea and Kim, Sundong},
year={2024}
}
Acknowledgement
This work was supported by the IITP (RS-2023-00216011, RS-2024-00445087, No. 2019-0-01842) and the NRF (RS-2024-00451162) grants funded by the Ministry of Science and ICT, Korea. Experiments were supported by the Accelerate Foundation Models Research Initiative, Microsoft.