Chong Wang, Jingwen Jiang, M. Daneva, M. V. Sinderen
Title: CoolTeD: A Web-based Collaborative Labeling Tool for the Textual Dataset
Venue: 2022 IEEE International Conference on Software Analysis, Evolution and Reengineering (SANER)
Published: 2022-03-01
DOI: https://doi.org/10.1109/saner53432.2022.00078
Citations: 2
Abstract
High-quality labeled textual data are vital for the automatic mining and analysis of the massive textual data produced by software systems. Several tools have been designed to facilitate manual labeling of textual data at different levels of granularity. However, these tools neither provide statistics and analysis of the labeled data, nor support collaboration among coders to reduce the time cost of manual labeling and improve the quality of the labeling results. In this paper, we present CoolTeD, a Web-based tool (available at http://williamsriver.cn) for collaborative labeling of textual datasets. Specifically, CoolTeD can be used: (1) to label textual data from the perspective of requirements types based on ISO 25010, (2) to review labeling results that have different confidence levels or contradictory labels, (3) to automatically calculate Cohen's Kappa coefficient for multiple coders, and (4) to visualize the labeling results. A demo of the tool is available at https://youtu.be/xVkrB_Cs1J8.
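The abstract does not say how CoolTeD computes agreement, so the following is only a minimal sketch of the standard Cohen's Kappa formula, kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed agreement and p_e the agreement expected by chance. Cohen's Kappa is defined for two coders, so the sketch assumes that "multiple coders" is handled by computing it for every pair. The function names, coder names, and label values are hypothetical and are not taken from the tool.

```python
from collections import Counter
from itertools import combinations


def cohen_kappa(labels_a, labels_b):
    """Cohen's Kappa for two coders who labeled the same items in the same order."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("both coders must label the same non-empty set of items")
    n = len(labels_a)
    # Observed agreement: fraction of items on which the two coders agree.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement: sum over labels of the product of each coder's label frequency.
    count_a, count_b = Counter(labels_a), Counter(labels_b)
    expected = sum(count_a[label] * count_b[label] for label in count_a) / (n * n)
    if expected == 1.0:
        # Both coders used one identical label everywhere; treat as perfect agreement.
        return 1.0
    return (observed - expected) / (1 - expected)


def pairwise_kappa(codings):
    """Assumed handling of more than two coders: Kappa for every pair of coders."""
    return {
        (x, y): cohen_kappa(codings[x], codings[y])
        for x, y in combinations(sorted(codings), 2)
    }


# Hypothetical labels for four requirement sentences (illustrative values only).
codings = {
    "coder_1": ["security", "usability", "security", "performance"],
    "coder_2": ["security", "usability", "performance", "performance"],
    "coder_3": ["security", "usability", "security", "performance"],
}
for pair, kappa in pairwise_kappa(codings).items():
    print(pair, round(kappa, 3))
```

Pairs with a low Kappa point to coders whose labels diverge, which is the kind of disagreement the review step in item (2) is meant to surface.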