
MC-LARC

From Generation to Selection: Findings of Converting Analogical Problem-Solving into Multiple-Choice Questions

Donghyeon Shin* · Seungpil Lee* · Klea Lena Kovačec · Sundong Kim†
EMNLP Findings 2024



This is the official GitHub repository for “From Generation to Selection: Findings of Converting Analogical Problem-Solving into Multiple-Choice Questions”.

We introduce several versions of MC-LARC, which are available here.

If you have any questions about our dataset, please contact us at shindong97411@gmail.com.

Main Results

  • MC-LARC’s Effect: MC-LARC, a multiple-choice version of ARC, raised LLM accuracy from 10% to 76%. By targeting the “Understand” and “Apply” stages of Bloom’s Taxonomy, it makes lower-level reasoning abilities easier to assess.

  • LLM Shortcuts: LLMs often relied on shortcuts, such as eliminating options based on repeated expressions or formatting, rather than actual reasoning; this inflated performance, especially when images were absent.

  • Self-Feedback to Reduce Shortcuts: We introduce a self-feedback framework in which LLMs refine the answer options after solving the questions, reducing shortcut use and improving the reliability of the evaluation process.

Citation

If you find our paper helpful in your research, please consider citing it:

@inproceedings{shin2024from,
  title={From Generation to Selection: Findings of Converting Analogical Problem-Solving into Multiple-Choice Questions},
  booktitle={Findings of the Association for Computational Linguistics: EMNLP 2024},
  author={Shin, Donghyeon and Lee, Seungpil and Kova{\v{c}}ec, Klea Lena and Kim, Sundong},
  year={2024}
}

Acknowledgement

This work was supported by the IITP (RS-2023-00216011, RS-2024-00445087, No. 2019-0-01842) and the NRF (RS-2024-00451162) grants funded by the Ministry of Science and ICT, Korea. Experiments were supported by the Accelerate Foundation Models Research Initiative, Microsoft.

This work was done at the GIST Data Science Lab.