Justin Winkler, Simon Brugman, Bas van Berkel, M. Larson
{"title":"探索产品适度的数据编程管道中的灵感集","authors":"Justin Winkler, Simon Brugman, Bas van Berkel, M. Larson","doi":"10.18653/v1/2021.ecnlp-1.16","DOIUrl":null,"url":null,"abstract":"We carry out a case study on the use of data programming to create data to train classifiers used for product moderation on a large e-commerce platform. Data programming is a recently-introduced technique that uses human-defined rules to generate training data sets without tedious item-by-item hand labeling. Our study investigates methods for allowing product moderators to quickly modify the rules given their knowledge of the domain and, especially, of textual item descriptions. Our results show promise that moderators can use this approach to steer the training data, making possible fast and close control of classifiers that detect policy violations.","PeriodicalId":210217,"journal":{"name":"Proceedings of The 4th Workshop on e-Commerce and NLP","volume":"55 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"1900-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Exploring Inspiration Sets in a Data Programming Pipeline for Product Moderation\",\"authors\":\"Justin Winkler, Simon Brugman, Bas van Berkel, M. Larson\",\"doi\":\"10.18653/v1/2021.ecnlp-1.16\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"We carry out a case study on the use of data programming to create data to train classifiers used for product moderation on a large e-commerce platform. Data programming is a recently-introduced technique that uses human-defined rules to generate training data sets without tedious item-by-item hand labeling. Our study investigates methods for allowing product moderators to quickly modify the rules given their knowledge of the domain and, especially, of textual item descriptions. Our results show promise that moderators can use this approach to steer the training data, making possible fast and close control of classifiers that detect policy violations.\",\"PeriodicalId\":210217,\"journal\":{\"name\":\"Proceedings of The 4th Workshop on e-Commerce and NLP\",\"volume\":\"55 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"1900-01-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Proceedings of The 4th Workshop on e-Commerce and NLP\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.18653/v1/2021.ecnlp-1.16\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of The 4th Workshop on e-Commerce and NLP","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.18653/v1/2021.ecnlp-1.16","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Exploring Inspiration Sets in a Data Programming Pipeline for Product Moderation
We carry out a case study on the use of data programming to create data to train classifiers used for product moderation on a large e-commerce platform. Data programming is a recently-introduced technique that uses human-defined rules to generate training data sets without tedious item-by-item hand labeling. Our study investigates methods for allowing product moderators to quickly modify the rules given their knowledge of the domain and, especially, of textual item descriptions. Our results show promise that moderators can use this approach to steer the training data, making possible fast and close control of classifiers that detect policy violations.