A survey on chatbots and large language models: Testing and evaluation techniques

Natural Language Processing Journal, Pub Date: 2025-01-25, DOI: 10.1016/j.nlp.2025.100128

Sonali Uttam Singh, Akbar Siami Namin

Citations: 0

Abstract

Chatbots have developed considerably in recent decades, evolving alongside the field of Artificial Intelligence (AI) and enabling powerful capabilities in tasks such as text generation and summarization, sentiment analysis, and many other Natural Language Processing (NLP) tasks. Advancements in language models (LMs), specifically large language models (LLMs), have played an important role in improving the capabilities of chatbots. This survey provides a comprehensive overview of chatbots that integrate LLMs, focusing primarily on the associated testing, evaluation, and performance techniques and frameworks. The paper discusses the foundational concepts of chatbots and their evolution, and highlights the challenges and opportunities they present by reviewing state-of-the-art papers on chatbot design, testing, and evaluation. The survey also delves into the key components of chatbot systems, including Natural Language Understanding (NLU), dialogue management, and Natural Language Generation (NLG), and examines how LLMs have influenced each of these components. Furthermore, the survey examines the ethical considerations and limitations associated with LLMs. The paper primarily focuses on investigating the evaluation techniques and metrics used to assess the performance and effectiveness of these language models. It aims to provide an overview of chatbots and to highlight the need for an appropriate framework for testing and evaluating chatbots and their associated LLMs, so that they provide efficient and appropriate knowledge to users and their quality can improve with advances in machine learning.
