{"title":"标记和学习:可视化机器学习分类器在数据标记过程中成功的可能性","authors":"Yunjia Sun, E. Lank, Michael A. Terry","doi":"10.1145/3025171.3025208","DOIUrl":null,"url":null,"abstract":"While machine learning is a powerful tool for the analysis and classification of complex real-world datasets, it is still challenging, particularly for developers with limited expertise, to incorporate this technology into their software systems. The first step in machine learning, data labeling, is traditionally thought of as a tedious, unavoidable task in building a machine learning classifier. However, in this paper, we argue that it can also serve as the first opportunity for developers to gain insight into their dataset. Through a Label-and-Learn interface, we explore visualization strategies that leverage the data labeling task to enhance developers' knowledge about their dataset, including the likely success of the classifier and the rationale behind the classifier's decisions. At the same time, we show that the visualizations also improve users' labeling experience by showing them the impact they have made on classifier performance. We assess the visualizations in Label-and-Learn and experimentally demonstrate their value to software developers who seek to assess the utility of machine learning during the data labeling process.","PeriodicalId":166632,"journal":{"name":"Proceedings of the 22nd International Conference on Intelligent User Interfaces","volume":"96 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2017-03-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"30","resultStr":"{\"title\":\"Label-and-Learn: Visualizing the Likelihood of Machine Learning Classifier's Success During Data Labeling\",\"authors\":\"Yunjia Sun, E. Lank, Michael A. Terry\",\"doi\":\"10.1145/3025171.3025208\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"While machine learning is a powerful tool for the analysis and classification of complex real-world datasets, it is still challenging, particularly for developers with limited expertise, to incorporate this technology into their software systems. The first step in machine learning, data labeling, is traditionally thought of as a tedious, unavoidable task in building a machine learning classifier. However, in this paper, we argue that it can also serve as the first opportunity for developers to gain insight into their dataset. Through a Label-and-Learn interface, we explore visualization strategies that leverage the data labeling task to enhance developers' knowledge about their dataset, including the likely success of the classifier and the rationale behind the classifier's decisions. At the same time, we show that the visualizations also improve users' labeling experience by showing them the impact they have made on classifier performance. We assess the visualizations in Label-and-Learn and experimentally demonstrate their value to software developers who seek to assess the utility of machine learning during the data labeling process.\",\"PeriodicalId\":166632,\"journal\":{\"name\":\"Proceedings of the 22nd International Conference on Intelligent User Interfaces\",\"volume\":\"96 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2017-03-07\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"30\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Proceedings of the 22nd International Conference on Intelligent User Interfaces\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1145/3025171.3025208\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 22nd International Conference on Intelligent User Interfaces","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3025171.3025208","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Label-and-Learn: Visualizing the Likelihood of Machine Learning Classifier's Success During Data Labeling
While machine learning is a powerful tool for the analysis and classification of complex real-world datasets, it is still challenging, particularly for developers with limited expertise, to incorporate this technology into their software systems. The first step in machine learning, data labeling, is traditionally thought of as a tedious, unavoidable task in building a machine learning classifier. However, in this paper, we argue that it can also serve as the first opportunity for developers to gain insight into their dataset. Through a Label-and-Learn interface, we explore visualization strategies that leverage the data labeling task to enhance developers' knowledge about their dataset, including the likely success of the classifier and the rationale behind the classifier's decisions. At the same time, we show that the visualizations also improve users' labeling experience by showing them the impact they have made on classifier performance. We assess the visualizations in Label-and-Learn and experimentally demonstrate their value to software developers who seek to assess the utility of machine learning during the data labeling process.