MC3:分类器建造成本最小化系统

Shay Gershtein, T. Milo, Gefen Morami, Slava Novgorodov
{"title":"MC3:分类器建造成本最小化系统","authors":"Shay Gershtein, T. Milo, Gefen Morami, Slava Novgorodov","doi":"10.1145/3318464.3384690","DOIUrl":null,"url":null,"abstract":"Search mechanisms over massive sets of items are the cornerstone of many modern applications, particularly in e-commerce websites. Consumers express in search queries a set of properties, and expect the system to retrieve qualifying items. A common difficulty, however, is that the information on whether or not an item satisfies the search criteria is sometimes not explicitly recorded in the repository. Instead, it may be considered as general knowledge or \"hidden\" in a picture/description, thereby leading to incomplete search results. To overcome these problems companies invest in building dedicated classifiers that determine whether an item satisfies the given search criteria. However, building classifiers typically incurs non-trivial costs due to the required volumes of high-quality labeled training data. In this demo, we introduce MC3, a real-time system that helps data analysts decide which classifiers to construct to minimize the costs of answering a set of search queries. MC3 is interactive and facilitates real-time analysis, by providing detailed classifiers impact information. We demonstrate the effectiveness of MC3 on real-world data and scenarios taken from a large e-commerce system, by interacting with the SIGMOD'20 audience members who act as analysts.","PeriodicalId":436122,"journal":{"name":"Proceedings of the 2020 ACM SIGMOD International Conference on Management of Data","volume":"64 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2020-05-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"1","resultStr":"{\"title\":\"MC3: A System for Minimization of Classifier Construction Cost\",\"authors\":\"Shay Gershtein, T. Milo, Gefen Morami, Slava Novgorodov\",\"doi\":\"10.1145/3318464.3384690\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Search mechanisms over massive sets of items are the cornerstone of many modern applications, particularly in e-commerce websites. Consumers express in search queries a set of properties, and expect the system to retrieve qualifying items. A common difficulty, however, is that the information on whether or not an item satisfies the search criteria is sometimes not explicitly recorded in the repository. Instead, it may be considered as general knowledge or \\\"hidden\\\" in a picture/description, thereby leading to incomplete search results. To overcome these problems companies invest in building dedicated classifiers that determine whether an item satisfies the given search criteria. However, building classifiers typically incurs non-trivial costs due to the required volumes of high-quality labeled training data. In this demo, we introduce MC3, a real-time system that helps data analysts decide which classifiers to construct to minimize the costs of answering a set of search queries. MC3 is interactive and facilitates real-time analysis, by providing detailed classifiers impact information. We demonstrate the effectiveness of MC3 on real-world data and scenarios taken from a large e-commerce system, by interacting with the SIGMOD'20 audience members who act as analysts.\",\"PeriodicalId\":436122,\"journal\":{\"name\":\"Proceedings of the 2020 ACM SIGMOD International Conference on Management of Data\",\"volume\":\"64 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2020-05-29\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"1\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Proceedings of the 2020 ACM SIGMOD International Conference on Management of Data\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1145/3318464.3384690\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 2020 ACM SIGMOD International Conference on Management of Data","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3318464.3384690","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 1

摘要

对大量条目的搜索机制是许多现代应用程序的基础,特别是在电子商务网站中。消费者在搜索查询中表达一组属性,并期望系统检索符合条件的项。然而,一个常见的困难是,关于项目是否满足搜索条件的信息有时没有显式地记录在存储库中。相反,它可能被认为是一般知识或“隐藏”在图片/描述中,从而导致不完整的搜索结果。为了克服这些问题,公司投资建立专门的分类器来确定一个项目是否满足给定的搜索标准。然而,由于需要大量高质量的标记训练数据,构建分类器通常会产生可观的成本。在本演示中,我们将介绍MC3,这是一个实时系统,可帮助数据分析人员决定构建哪些分类器,以最大限度地减少回答一组搜索查询的成本。MC3是交互式的,通过提供详细的分类器影响信息,便于实时分析。我们通过与作为分析师的SIGMOD'20听众进行交互,展示了MC3对来自大型电子商务系统的真实数据和场景的有效性。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
MC3: A System for Minimization of Classifier Construction Cost
Search mechanisms over massive sets of items are the cornerstone of many modern applications, particularly in e-commerce websites. Consumers express in search queries a set of properties, and expect the system to retrieve qualifying items. A common difficulty, however, is that the information on whether or not an item satisfies the search criteria is sometimes not explicitly recorded in the repository. Instead, it may be considered as general knowledge or "hidden" in a picture/description, thereby leading to incomplete search results. To overcome these problems companies invest in building dedicated classifiers that determine whether an item satisfies the given search criteria. However, building classifiers typically incurs non-trivial costs due to the required volumes of high-quality labeled training data. In this demo, we introduce MC3, a real-time system that helps data analysts decide which classifiers to construct to minimize the costs of answering a set of search queries. MC3 is interactive and facilitates real-time analysis, by providing detailed classifiers impact information. We demonstrate the effectiveness of MC3 on real-world data and scenarios taken from a large e-commerce system, by interacting with the SIGMOD'20 audience members who act as analysts.
求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
自引率
0.00%
发文量
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信