QSAnglyzer: Visual Analytics for Prismatic Analysis of Question Answering System Evaluations

N. Chen, Been Kim
{"title":"QSAnglyzer:问答系统评估的棱镜分析的可视化分析","authors":"N. Chen, Been Kim","doi":"10.1109/VAST.2017.8585733","DOIUrl":null,"url":null,"abstract":"Developing sophisticated artificial intelligence (AI) systems requires AI researchers to experiment with different designs and analyze results from evaluations (we refer this task as evaluation analysis). In this paper, we tackle the challenges of evaluation analysis in the domain of question-answering (QA) systems. Through in-depth studies with QA researchers, we identify tasks and goals of evaluation analysis and derive a set of design rationales, based on which we propose a novel approach termed prismatic analysis. Prismatic analysis examines data through multiple ways of categorization (referred as angles). Categories in each angle are measured by aggregate metrics to enable diverse comparison scenarios. To facilitate prismatic analysis of QA evaluations, we design and implement the Question Space Anglyzer (QSAnglyzer), a visual analytics (VA) tool. In QSAnglyzer, the high-dimensional space formed by questions is divided into categories based on several angles (e.g., topic and question type). Each category is aggregated by accuracy, the number of questions, and accuracy variance across evaluations. QSAnglyzer visualizes these angles so that QA researchers can examine and compare evaluations from various aspects both individually and collectively. Furthermore, QA researchers filter questions based on any angle by clicking to construct complex queries. We validate QSAnglyzer through controlled experiments and by expert reviews. The results indicate that when using QSAnglyzer, users perform analysis tasks faster $(p \\lt 0.01)$ and more accurately $(p \\lt 0.05)$, and are quick to gain new insight. We discuss how prismatic analysis and QSAnglyzer scaffold evaluation analysis, and provide directions for future research.","PeriodicalId":149607,"journal":{"name":"2017 IEEE Conference on Visual Analytics Science and Technology (VAST)","volume":"25 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2017-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"2","resultStr":"{\"title\":\"QSAnglyzer: Visual Analytics for Prismatic Analysis of Question Answering System Evaluations\",\"authors\":\"N. Chen, Been Kim\",\"doi\":\"10.1109/VAST.2017.8585733\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Developing sophisticated artificial intelligence (AI) systems requires AI researchers to experiment with different designs and analyze results from evaluations (we refer this task as evaluation analysis). In this paper, we tackle the challenges of evaluation analysis in the domain of question-answering (QA) systems. Through in-depth studies with QA researchers, we identify tasks and goals of evaluation analysis and derive a set of design rationales, based on which we propose a novel approach termed prismatic analysis. Prismatic analysis examines data through multiple ways of categorization (referred as angles). Categories in each angle are measured by aggregate metrics to enable diverse comparison scenarios. To facilitate prismatic analysis of QA evaluations, we design and implement the Question Space Anglyzer (QSAnglyzer), a visual analytics (VA) tool. In QSAnglyzer, the high-dimensional space formed by questions is divided into categories based on several angles (e.g., topic and question type). 
Each category is aggregated by accuracy, the number of questions, and accuracy variance across evaluations. QSAnglyzer visualizes these angles so that QA researchers can examine and compare evaluations from various aspects both individually and collectively. Furthermore, QA researchers filter questions based on any angle by clicking to construct complex queries. We validate QSAnglyzer through controlled experiments and by expert reviews. The results indicate that when using QSAnglyzer, users perform analysis tasks faster $(p \\\\lt 0.01)$ and more accurately $(p \\\\lt 0.05)$, and are quick to gain new insight. We discuss how prismatic analysis and QSAnglyzer scaffold evaluation analysis, and provide directions for future research.\",\"PeriodicalId\":149607,\"journal\":{\"name\":\"2017 IEEE Conference on Visual Analytics Science and Technology (VAST)\",\"volume\":\"25 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2017-10-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"2\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2017 IEEE Conference on Visual Analytics Science and Technology (VAST)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/VAST.2017.8585733\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2017 IEEE Conference on Visual Analytics Science and Technology (VAST)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/VAST.2017.8585733","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 2

Abstract

Developing sophisticated artificial intelligence (AI) systems requires AI researchers to experiment with different designs and analyze results from evaluations (we refer to this task as evaluation analysis). In this paper, we tackle the challenges of evaluation analysis in the domain of question-answering (QA) systems. Through in-depth studies with QA researchers, we identify the tasks and goals of evaluation analysis and derive a set of design rationales, based on which we propose a novel approach termed prismatic analysis. Prismatic analysis examines data through multiple ways of categorization (referred to as angles). Categories in each angle are measured by aggregate metrics to enable diverse comparison scenarios. To facilitate prismatic analysis of QA evaluations, we design and implement the Question Space Anglyzer (QSAnglyzer), a visual analytics (VA) tool. In QSAnglyzer, the high-dimensional space formed by questions is divided into categories based on several angles (e.g., topic and question type). Each category is aggregated by accuracy, the number of questions, and accuracy variance across evaluations. QSAnglyzer visualizes these angles so that QA researchers can examine and compare evaluations from various aspects, both individually and collectively. Furthermore, QA researchers can filter questions on any angle by clicking, which allows them to construct complex queries. We validate QSAnglyzer through controlled experiments and expert reviews. The results indicate that when using QSAnglyzer, users perform analysis tasks faster $(p < 0.01)$ and more accurately $(p < 0.05)$, and are quick to gain new insights. We discuss how prismatic analysis and QSAnglyzer scaffold evaluation analysis, and provide directions for future research.
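
To make the aggregation behind prismatic analysis concrete, the following is a minimal sketch in Python of how questions might be grouped into the categories of one angle and summarized by question count, per-evaluation accuracy, and accuracy variance across evaluations. The data model (the `angles` and `results` fields) and the helper `aggregate_angle` are illustrative assumptions, not the paper's actual implementation.

```python
# Sketch of per-angle aggregation for prismatic analysis (assumed data model,
# not the QSAnglyzer implementation).
from statistics import pvariance


def aggregate_angle(questions, angle, evaluations):
    # Group questions into categories along one angle (e.g. "topic").
    by_category = {}
    for q in questions:
        by_category.setdefault(q["angles"][angle], []).append(q)

    summary = {}
    for category, qs in by_category.items():
        # Per-evaluation accuracy within this category; results[ev] is 1 if
        # the question was answered correctly in evaluation ev, else 0.
        accuracy = {
            ev: sum(q["results"][ev] for q in qs) / len(qs)
            for ev in evaluations
        }
        summary[category] = {
            "num_questions": len(qs),
            "accuracy": accuracy,
            # Spread of accuracy across evaluations; a high value flags
            # categories on which the evaluated system versions differ most.
            "accuracy_variance": pvariance(list(accuracy.values())),
        }
    return summary


if __name__ == "__main__":
    questions = [
        {"angles": {"topic": "biology", "qtype": "which-is"},
         "results": {"run_a": 1, "run_b": 0}},
        {"angles": {"topic": "biology", "qtype": "fill-in"},
         "results": {"run_a": 1, "run_b": 1}},
        {"angles": {"topic": "physics", "qtype": "fill-in"},
         "results": {"run_a": 0, "run_b": 1}},
    ]
    print(aggregate_angle(questions, "topic", ["run_a", "run_b"]))
```

Under these assumptions, filtering by clicking a category would correspond to intersecting the question sets behind the selected categories and re-aggregating the remaining angles over that subset.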