Yihao Li , Pan Liu , Haiyang Wang , Jie Chu , W. Eric Wong
{"title":"Evaluating large language models for software testing","authors":"Yihao Li , Pan Liu , Haiyang Wang , Jie Chu , W. Eric Wong","doi":"10.1016/j.csi.2024.103942","DOIUrl":null,"url":null,"abstract":"<div><div>Large language models (LLMs) have demonstrated significant prowess in code analysis and natural language processing, making them highly valuable for software testing. This paper conducts a comprehensive evaluation of LLMs applied to software testing, with a particular emphasis on test case generation, error tracing, and bug localization across twelve open-source projects. The advantages and limitations, as well as recommendations associated with utilizing LLMs for these tasks, are delineated. Furthermore, we delve into the phenomenon of hallucination in LLMs, examining its impact on software testing processes and presenting solutions to mitigate its effects. The findings of this work contribute to a deeper understanding of integrating LLMs into software testing, providing insights that pave the way for enhanced effectiveness in the field.</div></div>","PeriodicalId":50635,"journal":{"name":"Computer Standards & Interfaces","volume":"93 ","pages":"Article 103942"},"PeriodicalIF":4.1000,"publicationDate":"2024-11-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Computer Standards & Interfaces","FirstCategoryId":"94","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S0920548924001119","RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, HARDWARE & ARCHITECTURE","Score":null,"Total":0}
引用次数: 0
Abstract
Large language models (LLMs) have demonstrated significant prowess in code analysis and natural language processing, making them highly valuable for software testing. This paper conducts a comprehensive evaluation of LLMs applied to software testing, with a particular emphasis on test case generation, error tracing, and bug localization across twelve open-source projects. The advantages and limitations, as well as recommendations associated with utilizing LLMs for these tasks, are delineated. Furthermore, we delve into the phenomenon of hallucination in LLMs, examining its impact on software testing processes and presenting solutions to mitigate its effects. The findings of this work contribute to a deeper understanding of integrating LLMs into software testing, providing insights that pave the way for enhanced effectiveness in the field.
期刊介绍:
The quality of software, well-defined interfaces (hardware and software), the process of digitalisation, and accepted standards in these fields are essential for building and exploiting complex computing, communication, multimedia and measuring systems. Standards can simplify the design and construction of individual hardware and software components and help to ensure satisfactory interworking.
Computer Standards & Interfaces is an international journal dealing specifically with these topics.
The journal
• Provides information about activities and progress on the definition of computer standards, software quality, interfaces and methods, at national, European and international levels
• Publishes critical comments on standards and standards activities
• Disseminates user''s experiences and case studies in the application and exploitation of established or emerging standards, interfaces and methods
• Offers a forum for discussion on actual projects, standards, interfaces and methods by recognised experts
• Stimulates relevant research by providing a specialised refereed medium.