Concurrent Linguistic Error Detection (CLED): A New Methodology for Error Detection in Large Language Models

IF 3.8 | CAS Zone 2, Computer Science | JCR Q2, Computer Science, Hardware & Architecture
Jinhua Zhu, Javier Conde, Zhen Gao, Pedro Reviriego, Shanshan Liu, Fabrizio Lombardi
{"title":"Concurrent Linguistic Error Detection (CLED): A New Methodology for Error Detection in Large Language Models","authors":"Jinhua Zhu;Javier Conde;Zhen Gao;Pedro Reviriego;Shanshan Liu;Fabrizio Lombardi","doi":"10.1109/TC.2025.3603682","DOIUrl":null,"url":null,"abstract":"The utilization of Large Language Models (LLMs) requires dependable operation in the presence of errors in the hardware (caused by for example radiation) as this has become a pressing concern. At the same time, the scale and complexity of LLMs limit the overhead that can be added to detect errors. Therefore, there is a need for low-cost error detection schemes. Concurrent Error Detection (CED) uses the properties of a system to detect errors, so it is an appealing approach. In this paper, we present a new methodology and scheme for error detection in LLMs: Concurrent Linguistic Error Detection (CLED). Its main principle is that the output of LLMs should be valid and generate coherent text; therefore, when the text is not valid or differs significantly from the normal text, it is likely that there is an error. Hence, errors can potentially be detected by checking the linguistic features of the text generated by LLMs. This has the following main advantages: 1) low overhead as the checks are simple and 2) general applicability, so regardless of the LLM implementation details because the text correctness is not related to the LLM algorithms or implementations. The proposed CLED has been evaluated on two LLMs: T5 and OPUS-MT. The results show that with a 1% overhead, CLED can detect more than 87% of the errors, making it suitable to improve LLM dependability at low cost.","PeriodicalId":13087,"journal":{"name":"IEEE Transactions on Computers","volume":"74 11","pages":"3638-3651"},"PeriodicalIF":3.8000,"publicationDate":"2025-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE Transactions on Computers","FirstCategoryId":"94","ListUrlMain":"https://ieeexplore.ieee.org/document/11145323/","RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"COMPUTER SCIENCE, HARDWARE & ARCHITECTURE","Score":null,"Total":0}
Citations: 0

Abstract

Dependable operation of Large Language Models (LLMs) in the presence of hardware errors (caused, for example, by radiation) has become a pressing concern. At the same time, the scale and complexity of LLMs limit the overhead that can be added to detect errors, so low-cost error detection schemes are needed. Concurrent Error Detection (CED) uses the properties of a system to detect errors, making it an appealing approach. In this paper, we present a new methodology and scheme for error detection in LLMs: Concurrent Linguistic Error Detection (CLED). Its main principle is that an LLM should generate valid and coherent text; therefore, when the output is not valid or differs significantly from normal text, an error is likely. Hence, errors can potentially be detected by checking the linguistic features of the text generated by the LLM. This has two main advantages: 1) low overhead, as the checks are simple, and 2) general applicability regardless of the LLM implementation details, because text correctness does not depend on the LLM's algorithms or implementation. The proposed CLED has been evaluated on two LLMs: T5 and OPUS-MT. The results show that with a 1% overhead, CLED can detect more than 87% of the errors, making it suitable to improve LLM dependability at low cost.
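To make the principle concrete, the sketch below is a minimal, hypothetical illustration of the idea of concurrent linguistic checks, not the CLED implementation evaluated in the paper: it computes a couple of cheap linguistic features of the generated text (fraction of out-of-vocabulary words and amount of repetition) and flags a likely error when they deviate from normal values. The vocabulary, feature set, and thresholds are illustrative assumptions.

```python
import re

# Stand-in vocabulary; a real deployment would use the target language's lexicon.
ENGLISH_VOCAB = {"the", "model", "translates", "text", "into", "english"}

def linguistic_features(text: str) -> dict:
    """Compute a few cheap linguistic features of the generated text."""
    tokens = re.findall(r"[a-zA-Z']+", text.lower())
    if not tokens:
        # Empty output is treated as maximally suspicious.
        return {"oov_ratio": 1.0, "repetition_ratio": 1.0}
    oov = sum(1 for t in tokens if t not in ENGLISH_VOCAB)
    unique = len(set(tokens))
    return {
        "oov_ratio": oov / len(tokens),                  # invalid or unknown words
        "repetition_ratio": 1.0 - unique / len(tokens),  # degenerate repetition
    }

def cled_style_check(text: str, oov_threshold: float = 0.3,
                     repetition_threshold: float = 0.5) -> bool:
    """Return True if the generated text looks erroneous (thresholds are illustrative)."""
    features = linguistic_features(text)
    return (features["oov_ratio"] > oov_threshold
            or features["repetition_ratio"] > repetition_threshold)

# A normal output passes, while a corrupted output with garbled or repeated
# tokens trips the detector.
print(cled_style_check("The model translates text into English"))  # False
print(cled_style_check("the qzxv qzxv qzxv qzxv qzxv qzxv"))        # True
```

Because such checks only inspect the output string, they stay cheap and apply to any LLM regardless of its internal architecture, which is the property the paper exploits.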
Source Journal
IEEE Transactions on Computers (Engineering Technology: Electrical & Electronic Engineering)
CiteScore: 6.60
Self-citation rate: 5.40%
Articles per year: 199
Review time: 6.0 months
Journal description: The IEEE Transactions on Computers is a monthly publication with a wide distribution to researchers, developers, technical managers, and educators in the computer field. It publishes papers on research in areas of current interest to the readers. These areas include, but are not limited to, the following: a) computer organizations and architectures; b) operating systems, software systems, and communication protocols; c) real-time systems and embedded systems; d) digital devices, computer components, and interconnection networks; e) specification, design, prototyping, and testing methods and tools; f) performance, fault tolerance, reliability, security, and testability; g) case studies and experimental and theoretical evaluations; and h) new and important applications and trends.