Competing Topic Naming Conventions in Quora: Predicting Appropriate Topic Merges and Winning Topics from Millions of Topic Pairs

Binny Mathew, S. Maity, Pawan Goyal, Animesh Mukherjee
Proceedings of the 7th ACM IKDD CoDS and 25th COMAD
Published: 2019-09-10
DOI: 10.1145/3371158.3371173 (https://doi.org/10.1145/3371158.3371173)
Citations: 2

Abstract

Quora is a popular Q&A site that lets users tag questions with multiple relevant topics, which helps attract quality answers. These topics are not predefined; they are user-defined conventions, and it is not rare for several conventions describing exactly the same concept to coexist in the Quora ecosystem. In almost all such cases, users (or Quora moderators) manually merge the topic pair into one of the two topics, thereby selecting one of the competing conventions. An important application for the site is therefore to identify, early enough, the competing conventions that should merge in the future. In this paper, we propose a two-step approach that uniquely combines the anomaly detection and supervised classification frameworks to predict whether two topics, from among millions of topic pairs, are indeed competing conventions and should merge, achieving an F-score of 0.711. We also develop a model to predict the direction of the topic merge, i.e., the winning convention, achieving an F-score of 0.898. Our system is also able to predict about 25% of the correct merges within the first month of the merge and about 40% within a year. This is an encouraging result, since Quora users take 936 days on average to identify such a correct merge.
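
The abstract gives only the outline of the two-step approach, so the following is a minimal sketch of how a filter-then-classify pipeline and its F-score evaluation could be structured. It is not the authors' implementation: the z-score filter, the `classify` callback, the similarity scores, and the topic names are all hypothetical stand-ins chosen for illustration.

```python
from statistics import mean, pstdev
from typing import Callable, Dict, Iterable, List, Tuple

Pair = Tuple[str, str]


def anomaly_filter(scores: Dict[Pair, float], z_cut: float = 1.0) -> List[Pair]:
    """Step 1 (assumed form): keep pairs whose score is unusually high.

    A plain z-score cut stands in for whatever anomaly detector is used.
    """
    values = list(scores.values())
    mu = mean(values)
    sigma = pstdev(values) or 1.0  # avoid division by zero on constant input
    return [pair for pair, s in scores.items() if (s - mu) / sigma > z_cut]


def predict_merges(
    scores: Dict[Pair, float],
    classify: Callable[[Pair], bool],
    z_cut: float = 1.0,
) -> List[Pair]:
    """Step 2 (assumed form): run a supervised classifier on the survivors only."""
    return [pair for pair in anomaly_filter(scores, z_cut) if classify(pair)]


def f_score(predicted: Iterable[Pair], actual: Iterable[Pair]) -> float:
    """F-score (harmonic mean of precision and recall) over sets of pairs."""
    predicted_set, actual_set = set(predicted), set(actual)
    true_positives = len(predicted_set & actual_set)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted_set)
    recall = true_positives / len(actual_set)
    return 2 * precision * recall / (precision + recall)


# Hypothetical toy data: similarity scores for candidate topic pairs.
scores = {
    ("NYC", "New York City"): 0.95,
    ("ML", "Machine Learning"): 0.90,
    ("Cats", "Dogs"): 0.20,
    ("Java", "Coffee"): 0.10,
    ("Physics", "Chemistry"): 0.15,
    ("USA", "United States"): 0.30,
}

# Hypothetical ground truth: pairs that were eventually merged.
actual = [
    ("NYC", "New York City"),
    ("ML", "Machine Learning"),
    ("USA", "United States"),
]


def toy_classifier(pair: Pair) -> bool:
    """Stand-in for a trained model: accept pairs sharing a first letter."""
    return pair[0][0].lower() == pair[1][0].lower()


predicted = predict_merges(scores, toy_classifier)
print(predicted, round(f_score(predicted, actual), 3))
```

The point of the two-stage layout is that the cheap filter shrinks millions of candidate pairs to a small set before the more expensive classifier runs; in this toy run the filter misses one true merge, which lowers recall and therefore the F-score.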