A survey on chatbots and large language models: Testing and evaluation techniques

Sonali Uttam Singh, Akbar Siami Namin
{"title":"A survey on chatbots and large language models: Testing and evaluation techniques","authors":"Sonali Uttam Singh,&nbsp;Akbar Siami Namin","doi":"10.1016/j.nlp.2025.100128","DOIUrl":null,"url":null,"abstract":"<div><div>Chatbots have been quite developed in the recent decades and evolved along with the field of Artificial Intelligence (AI), enabling powerful capabilities in tasks such as text generation and summarization, sentiment analysis, and many other interesting Natural Language Processing (NLP) based tasks. Advancements in language models (LMs), specifically LLMs, have played an important role in improving the capabilities of chatbots. This survey paper provides a comprehensive overview in chatbot with the integration of LLMs, primarily focusing on the testing, evaluation and performance techniques and frameworks associated with it. The paper discusses the foundational concepts of chatbots and their evolution, highlights the challenges and opportunities they present by reviewing the state-of-the-art papers associated with the chatbots design, testing and evaluation. The survey also delves into the key components of chatbot systems, including Natural Language Understanding (NLU), dialogue management, and Natural Language Generation (NLG), and examine how LLMs have influenced each of these components. Furthermore, the survey examines the ethical considerations and limitations associated with LLMs. The paper primarily focuses on investigating the evaluation techniques and metrics used to assess the performance and effectiveness of these language models. This paper aims to provide an overview of chatbots and highlights the need for an appropriate framework in regards to testing and evaluating these chatbots and the LLMs associated with it in order to provide efficient and proper knowledge to user and potentially improve its quality based on advancements in the field of machine learning.</div></div>","PeriodicalId":100944,"journal":{"name":"Natural Language Processing Journal","volume":"10 ","pages":"Article 100128"},"PeriodicalIF":0.0000,"publicationDate":"2025-01-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Natural Language Processing Journal","FirstCategoryId":"1085","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S2949719125000044","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0

Abstract

Chatbots have been quite developed in the recent decades and evolved along with the field of Artificial Intelligence (AI), enabling powerful capabilities in tasks such as text generation and summarization, sentiment analysis, and many other interesting Natural Language Processing (NLP) based tasks. Advancements in language models (LMs), specifically LLMs, have played an important role in improving the capabilities of chatbots. This survey paper provides a comprehensive overview in chatbot with the integration of LLMs, primarily focusing on the testing, evaluation and performance techniques and frameworks associated with it. The paper discusses the foundational concepts of chatbots and their evolution, highlights the challenges and opportunities they present by reviewing the state-of-the-art papers associated with the chatbots design, testing and evaluation. The survey also delves into the key components of chatbot systems, including Natural Language Understanding (NLU), dialogue management, and Natural Language Generation (NLG), and examine how LLMs have influenced each of these components. Furthermore, the survey examines the ethical considerations and limitations associated with LLMs. The paper primarily focuses on investigating the evaluation techniques and metrics used to assess the performance and effectiveness of these language models. This paper aims to provide an overview of chatbots and highlights the need for an appropriate framework in regards to testing and evaluating these chatbots and the LLMs associated with it in order to provide efficient and proper knowledge to user and potentially improve its quality based on advancements in the field of machine learning.
求助全文
约1分钟内获得全文 求助全文
来源期刊
自引率
0.00%
发文量
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信