标记和学习:可视化机器学习分类器在数据标记过程中成功的可能性

Yunjia Sun, E. Lank, Michael A. Terry
{"title":"标记和学习:可视化机器学习分类器在数据标记过程中成功的可能性","authors":"Yunjia Sun, E. Lank, Michael A. Terry","doi":"10.1145/3025171.3025208","DOIUrl":null,"url":null,"abstract":"While machine learning is a powerful tool for the analysis and classification of complex real-world datasets, it is still challenging, particularly for developers with limited expertise, to incorporate this technology into their software systems. The first step in machine learning, data labeling, is traditionally thought of as a tedious, unavoidable task in building a machine learning classifier. However, in this paper, we argue that it can also serve as the first opportunity for developers to gain insight into their dataset. Through a Label-and-Learn interface, we explore visualization strategies that leverage the data labeling task to enhance developers' knowledge about their dataset, including the likely success of the classifier and the rationale behind the classifier's decisions. At the same time, we show that the visualizations also improve users' labeling experience by showing them the impact they have made on classifier performance. We assess the visualizations in Label-and-Learn and experimentally demonstrate their value to software developers who seek to assess the utility of machine learning during the data labeling process.","PeriodicalId":166632,"journal":{"name":"Proceedings of the 22nd International Conference on Intelligent User Interfaces","volume":"96 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2017-03-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"30","resultStr":"{\"title\":\"Label-and-Learn: Visualizing the Likelihood of Machine Learning Classifier's Success During Data Labeling\",\"authors\":\"Yunjia Sun, E. Lank, Michael A. Terry\",\"doi\":\"10.1145/3025171.3025208\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"While machine learning is a powerful tool for the analysis and classification of complex real-world datasets, it is still challenging, particularly for developers with limited expertise, to incorporate this technology into their software systems. The first step in machine learning, data labeling, is traditionally thought of as a tedious, unavoidable task in building a machine learning classifier. However, in this paper, we argue that it can also serve as the first opportunity for developers to gain insight into their dataset. Through a Label-and-Learn interface, we explore visualization strategies that leverage the data labeling task to enhance developers' knowledge about their dataset, including the likely success of the classifier and the rationale behind the classifier's decisions. At the same time, we show that the visualizations also improve users' labeling experience by showing them the impact they have made on classifier performance. We assess the visualizations in Label-and-Learn and experimentally demonstrate their value to software developers who seek to assess the utility of machine learning during the data labeling process.\",\"PeriodicalId\":166632,\"journal\":{\"name\":\"Proceedings of the 22nd International Conference on Intelligent User Interfaces\",\"volume\":\"96 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2017-03-07\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"30\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Proceedings of the 22nd International Conference on Intelligent User Interfaces\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1145/3025171.3025208\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 22nd International Conference on Intelligent User Interfaces","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3025171.3025208","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 30

摘要

虽然机器学习是分析和分类复杂现实世界数据集的强大工具,但将这项技术整合到他们的软件系统中仍然具有挑战性,特别是对于专业知识有限的开发人员。机器学习的第一步,数据标记,传统上被认为是构建机器学习分类器的繁琐且不可避免的任务。然而,在本文中,我们认为它也可以作为开发人员深入了解其数据集的第一次机会。通过标签和学习界面,我们探索了可视化策略,利用数据标记任务来增强开发人员对其数据集的了解,包括分类器的可能成功和分类器决策背后的基本原理。同时,我们展示了可视化还通过向用户展示他们对分类器性能的影响来改善用户的标签体验。我们评估了标签和学习中的可视化,并通过实验证明了它们对在数据标记过程中寻求评估机器学习效用的软件开发人员的价值。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
Label-and-Learn: Visualizing the Likelihood of Machine Learning Classifier's Success During Data Labeling
While machine learning is a powerful tool for the analysis and classification of complex real-world datasets, it is still challenging, particularly for developers with limited expertise, to incorporate this technology into their software systems. The first step in machine learning, data labeling, is traditionally thought of as a tedious, unavoidable task in building a machine learning classifier. However, in this paper, we argue that it can also serve as the first opportunity for developers to gain insight into their dataset. Through a Label-and-Learn interface, we explore visualization strategies that leverage the data labeling task to enhance developers' knowledge about their dataset, including the likely success of the classifier and the rationale behind the classifier's decisions. At the same time, we show that the visualizations also improve users' labeling experience by showing them the impact they have made on classifier performance. We assess the visualizations in Label-and-Learn and experimentally demonstrate their value to software developers who seek to assess the utility of machine learning during the data labeling process.
求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
自引率
0.00%
发文量
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:604180095
Book学术官方微信