Evaluating Parrots and Sociopathic Liars (keynote)
T. Sakai
Proceedings of the 2023 ACM SIGIR International Conference on Theory of Information Retrieval (ICTIR 2023), 2023-08-09. DOI: 10.1145/3578337.3605144. Citations: 1.

Abstract: This talk builds on my SWAN (Schematised Weighted Average Nugget) paper published in May 2023, which discusses a generic framework for auditing a given textual conversational system. The framework assumes that conversation sessions have already been sampled through either human-in-the-loop experiments or user simulation, and is designed to handle task-oriented and non-task-oriented conversations seamlessly. The arXiv paper also discusses a schema containing twenty (+1) criteria for scoring nuggets (i.e., factual statements and dialogue acts within each turn of the conversations) either manually or (semi-)automatically. By "parrots," I am referring to the stochastic parrots of Professor Emily M. Bender et al., i.e., large language models. By "sociopathic liars," I am referring to the same thing, as Professor Shannon Bowen of the University of South Carolina describes them as follows: "Sociopathic liars are the most damaging types of liars because they lie on a routine basis without conscience and often without reason. Whereas pathetic liars lie to get along, and narcissistic liars prevaricate to cover their inaction, drama, or ineptitude, sociopaths lie simply because they feel like it. Lying is easy for them, and they lie with no conscience or remorse." I would like to primarily discuss how researchers might be able to prevent conversational systems from doing harm to users, to labellers, and to society, rather than how we might evaluate the good things that the systems might bring only to privileged people. Furthermore, I would like to argue that ICTIR is a perfect place for such a discussion.
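The abstract's notion of a "weighted average nugget" score can be illustrated with a small sketch. This is a hypothetical reading of the idea, not the SWAN paper's actual formulation: the function names, the criterion names, and the uniform-weighting-across-nuggets scheme are all assumptions made here for illustration.

```python
# Hypothetical sketch of a weighted-average nugget score, in the spirit of
# SWAN (Schematised Weighted Average Nugget). The criterion names, weights,
# and aggregation scheme below are illustrative assumptions, not the paper's.

def swan_score(nuggets, weights):
    """Weighted average of per-criterion nugget scores.

    nuggets: list of dicts mapping criterion name -> score in [0, 1],
             one dict per nugget (factual statement or dialogue act).
    weights: dict mapping criterion name -> non-negative weight.
    """
    total_weight = sum(weights.values()) * len(nuggets)
    if total_weight == 0:
        return 0.0  # no nuggets or all-zero weights: nothing to average
    weighted_sum = sum(
        weights[c] * nugget.get(c, 0.0)  # missing criterion counts as 0
        for nugget in nuggets
        for c in weights
    )
    return weighted_sum / total_weight

# Example: two nuggets scored on two assumed criteria from the schema.
nuggets = [
    {"factual_correctness": 1.0, "harmlessness": 0.5},
    {"factual_correctness": 0.0, "harmlessness": 1.0},
]
weights = {"factual_correctness": 2.0, "harmlessness": 1.0}
print(round(swan_score(nuggets, weights), 3))  # → 0.583
```

Weighting criteria differently is one plausible way a schema of twenty (+1) criteria could prioritise harm-related criteria (e.g., the illustrative "harmlessness" above) over others when auditing a system.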