Xiaohui Dong, Zhengluo Li, Haoming Su, Jixiang Xue, Xiaochao Dang

Title: Transformer-enhanced hierarchical encoding with multi-decoder for diversified MCQ distractor generation
Journal: Artificial Intelligence Review, 58(8) (Q1, Computer Science, Artificial Intelligence; IF 10.7)
DOI: 10.1007/s10462-025-11237-3
Published: 2025-05-03
Open-access PDF: https://link.springer.com/content/pdf/10.1007/s10462-025-11237-3.pdf
Citations: 0
Abstract
The validity of multiple-choice questions (MCQs) in reading comprehension assessments relies heavily on the quality of the distractors. However, manually designing these distractors is time-consuming and costly, prompting researchers to turn to computer technology for automatic distractor generation. The task takes a reading comprehension article, a question, and its correct answer as input, with the goal of generating distractors that are related to the answer, semantically consistent with the question, and traceable within the article. Early heuristic rule-based approaches generated only word-level or phrase-level distractors; recent studies have shifted towards sequence-to-sequence neural networks for sentence-level distractor generation. Despite these advancements, such methods face two key challenges: difficulty capturing long-distance semantic relationships within the context, which leads to overly general or context-independent distractors, and a tendency for the generated distractors to be semantically similar to one another. To address these limitations, this paper proposes a Transformer-Enhanced Hierarchical Encoding with Multi-Decoder (THE-MD) network composed of a hierarchical encoder and multiple decoders. Specifically, the encoder employs the Transformer architecture to encode the context and capture long-range semantic information, yielding more contextually relevant distractors. The decoders use multiple decoding strategies and a dissimilarity loss function to collaboratively generate diverse distractors. Experimental results show that THE-MD outperforms existing baselines on both automatic and manual evaluation metrics. On the RACE and RACE++ datasets, the model raises BLEU-4 to 7.45 and 10.60 and ROUGE-L to 22.96 and 34.88, respectively, while also performing well on fluency and coherence metrics. These improvements highlight its potential to enhance MCQ distractor generation in educational assessments.
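The abstract describes a dissimilarity loss that pushes the multiple decoders toward semantically different distractors. The paper's exact formulation is not given here, so the following is only a minimal sketch of one common way such a loss can be built: average pairwise cosine similarity between the decoders' sentence representations, which the training objective would then penalize. All names and shapes are illustrative assumptions, not the authors' implementation.

```python
# Hypothetical dissimilarity loss sketch (NOT the paper's exact formulation):
# each decoder is assumed to produce a fixed-size sentence representation,
# and the loss is the mean pairwise cosine similarity between them.
import numpy as np

def cosine_sim(a, b):
    """Cosine similarity between two 1-D vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def dissimilarity_loss(reprs):
    """Mean pairwise cosine similarity across decoder representations.

    Minimizing this term encourages the decoders to produce
    mutually diverse distractors.
    """
    n = len(reprs)
    pairs = [(i, j) for i in range(n) for j in range(i + 1, n)]
    return sum(cosine_sim(reprs[i], reprs[j]) for i, j in pairs) / len(pairs)

# Three decoder outputs: two near-identical, one distinct.
r1 = np.array([1.0, 0.0, 0.0])
r2 = np.array([0.99, 0.1, 0.0])
r3 = np.array([0.0, 1.0, 0.0])
loss = dissimilarity_loss([r1, r2, r3])
```

Identical representations drive the term toward 1, orthogonal ones toward 0, so adding it to the generation loss trades a little likelihood for diversity.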
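The reported ROUGE-L scores are F-scores over the longest common subsequence (LCS) of a generated distractor and a reference. As a hedged illustration of how that metric is computed (not the authors' evaluation script; the `beta` weighting and whitespace tokenization here are simplifying assumptions):

```python
# Minimal ROUGE-L sketch: LCS-based F-score over whitespace tokens.
def lcs_len(x, y):
    """Length of the longest common subsequence of two token lists."""
    dp = [[0] * (len(y) + 1) for _ in range(len(x) + 1)]
    for i, xt in enumerate(x, 1):
        for j, yt in enumerate(y, 1):
            dp[i][j] = dp[i - 1][j - 1] + 1 if xt == yt else max(dp[i - 1][j], dp[i][j - 1])
    return dp[-1][-1]

def rouge_l(candidate, reference, beta=1.2):
    """ROUGE-L F-score between two whitespace-tokenized sentences."""
    c, r = candidate.split(), reference.split()
    lcs = lcs_len(c, r)
    if lcs == 0:
        return 0.0
    prec, rec = lcs / len(c), lcs / len(r)
    return (1 + beta ** 2) * prec * rec / (rec + beta ** 2 * prec)
```

Unlike BLEU-4's contiguous 4-gram matching, LCS credits in-order but non-contiguous overlap, which is why the two metrics are usually reported together.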
Journal introduction:
Artificial Intelligence Review, a fully open access journal, publishes cutting-edge research in artificial intelligence and cognitive science. It features critical evaluations of applications, techniques, and algorithms, providing a platform for both researchers and application developers. The journal includes refereed survey and tutorial articles, along with reviews and commentary on significant developments in the field.