Yahui Liu, Zhenghua Li, Chen Gong, Shilin Zhou, Min Zhang
Expert Systems with Applications, Volume 290, Article 128374. DOI: 10.1016/j.eswa.2025.128374. Published 2025-05-31. JCR Q1, Computer Science, Artificial Intelligence (Impact Factor 7.5).
Annotation error detection in painstakingly annotated data: Part-of-speech tagging as a case study
The annotation error detection (AED) task aims to automatically identify annotation errors in a dataset, which is crucial for ensuring the reliability and effectiveness of expert and intelligent systems across diverse applications. Most previous works employ either synthesized data or subsets of crowdsourced datasets. In contrast, this work focuses on detecting errors in painstakingly annotated data, using part-of-speech (POS) tagging as a case study. We construct a high-quality Chinese AED dataset, named CTB7E, by manually re-annotating the test set of CTB7. Among 81,578 tags, we identify approximately 1,700 erroneous tags, a 2.1% error rate. We apply Kullback-Leibler (KL) divergence to AED for the first time and propose two new metrics. We investigate a wide range of AED approaches on both CTB7E and a synthesized dataset, under both single-model and Monte Carlo dropout settings. The results and analyses reveal interesting insights. We will release our data and code at https://github.com/yahui19960717/POS_AED.git to facilitate further research and collaboration in this area.
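The abstract does not detail how KL divergence is applied to AED; the sketch below is one plausible, illustrative variant (not necessarily the authors' method): score each token by the KL divergence from the one-hot gold-tag distribution to the model's predicted tag distribution, which reduces to the negative log-probability of the annotated tag, and rank tokens so that annotations the model finds least likely surface first. All function names and the toy data are assumptions for illustration.

```python
import numpy as np

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for discrete distributions over the tag set."""
    p = np.clip(p, eps, 1.0)
    q = np.clip(q, eps, 1.0)
    return float(np.sum(p * np.log(p / q)))

def rank_suspicious_tokens(pred_dists, gold_tags):
    """Rank tokens by KL(onehot(gold) || model prediction).

    For a one-hot gold distribution this reduces to -log p(gold tag),
    so high scores flag tokens whose annotation the model finds unlikely.
    """
    scores = []
    for i, (dist, gold) in enumerate(zip(pred_dists, gold_tags)):
        onehot = np.zeros_like(dist)
        onehot[gold] = 1.0
        scores.append((kl_divergence(onehot, dist), i))
    return sorted(scores, reverse=True)  # most suspicious first

# Toy example: 3 tokens, 3 POS tags; token 2's annotation looks wrong.
preds = np.array([[0.90, 0.05, 0.05],
                  [0.10, 0.80, 0.10],
                  [0.02, 0.03, 0.95]])
gold = [0, 1, 0]  # token 2 annotated with tag 0, but the model favors tag 2
ranking = rank_suspicious_tokens(preds, gold)
```

In a Monte Carlo dropout setting, `preds` would be replaced by the mean of several stochastic forward passes, letting the score also reflect model uncertainty.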
About the journal:
Expert Systems With Applications is an international journal dedicated to the exchange of information on expert and intelligent systems used globally in industry, government, and universities. The journal emphasizes original papers covering the design, development, testing, implementation, and management of these systems, offering practical guidelines. It spans various sectors such as finance, engineering, marketing, law, project management, information management, medicine, and more. The journal also welcomes papers on multi-agent systems, knowledge management, neural networks, knowledge discovery, data mining, and other related areas, excluding applications to military/defense systems.