Label-and-Learn: Visualizing the Likelihood of Machine Learning Classifier's Success During Data Labeling

Proceedings of the 22nd International Conference on Intelligent User Interfaces. Pub Date: 2017-03-07. DOI: 10.1145/3025171.3025208

Yunjia Sun, E. Lank, Michael A. Terry

Citations: 30

Abstract

Label-and-Learn: Visualizing the Likelihood of Machine Learning Classifier's Success During Data Labeling

While machine learning is a powerful tool for the analysis and classification of complex real-world datasets, it is still challenging, particularly for developers with limited expertise, to incorporate this technology into their software systems. The first step in machine learning, data labeling, is traditionally thought of as a tedious, unavoidable task in building a machine learning classifier. However, in this paper, we argue that it can also serve as the first opportunity for developers to gain insight into their dataset. Through a Label-and-Learn interface, we explore visualization strategies that leverage the data labeling task to enhance developers' knowledge about their dataset, including the likely success of the classifier and the rationale behind the classifier's decisions. At the same time, we show that the visualizations also improve users' labeling experience by showing them the impact they have made on classifier performance. We assess the visualizations in Label-and-Learn and experimentally demonstrate their value to software developers who seek to assess the utility of machine learning during the data labeling process.

