Hongxuan Li, Bifan Wei, Jun Liu, Zhaotong Guo, Jingchao Qi, Bei Wu, Yong Liu, Yuanyuan Shi
ToFM: Topic-specific Facet Mining by Facet Propagation within Clusters
2021 IEEE International Conference on Big Knowledge (ICBK), published 2021-12-01
DOI: https://doi.org/10.1109/ICKG52313.2021.00060
Abstract
Mining the facets of topics is an essential task for information retrieval, information extraction and knowledge base construction. For course topics, three challenges arise: different topics have different facets, facet labels rarely appear in the topic description text, and not all topics have enough textual information to mine facets from. In this paper we propose a weakly-supervised algorithm for topic-specific facet mining (ToFM for short), based on our finding that similar topics in a cluster have similar facet sets. For example, the topics Binary Search Tree, Suffix Tree and AVL tree in the Tree cluster share facets such as example, insertion, deletion and traversal. ToFM first splits the topics in a domain into several topic clusters based on the topic description text. ToFM then extracts an initial facet set for each topic from the corresponding Wikipedia article page. Finally, ToFM performs a normalized facet propagation within each topic cluster to obtain the final facet set of every topic. We evaluate ToFM on six real-world datasets, and the experimental results show that it outperforms existing facet mining algorithms.
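
The abstract does not give the propagation formula, so the following is only a minimal sketch of the third step under stated assumptions: each topic collects facet votes from the topics in its cluster, weighted by a topic similarity, and the votes are normalized by the total weight. The names `propagate_facets`, `similarity` and `threshold`, the uniform similarity, and the rule that a topic keeps its own initial facets are all hypothetical choices for illustration, not the authors' method.

```python
from collections import defaultdict


def propagate_facets(initial, clusters, similarity, threshold=0.5):
    """Illustrative facet propagation within topic clusters.

    initial:    dict mapping topic -> set of initial facets
    clusters:   list of lists of topics (assumed to be given by a prior clustering step)
    similarity: function (topic_a, topic_b) -> weight in [0, 1] (hypothetical)
    threshold:  minimum normalized score for a propagated facet (hypothetical)
    """
    final = {}
    for cluster in clusters:
        for topic in cluster:
            scores = defaultdict(float)
            total = 0.0
            for other in cluster:
                # A topic always counts fully toward its own score.
                weight = 1.0 if other == topic else similarity(topic, other)
                total += weight
                for facet in initial.get(other, set()):
                    scores[facet] += weight
            # Normalize each facet score by the total weight in the cluster.
            propagated = {
                facet for facet, score in scores.items()
                if score / total >= threshold
            }
            # Assumption: a topic never loses a facet it started with.
            final[topic] = set(initial.get(topic, set())) | propagated
    return final


# Toy data loosely based on the Tree cluster mentioned in the abstract.
initial = {
    "Binary Search Tree": {"insertion", "deletion", "traversal"},
    "AVL tree": {"insertion", "deletion", "rotation"},
    "Suffix Tree": set(),  # a topic with too little text to mine facets from
}
clusters = [["Binary Search Tree", "AVL tree", "Suffix Tree"]]

# Uniform similarity, purely for illustration.
final = propagate_facets(initial, clusters, lambda a, b: 1.0)
print(final["Suffix Tree"])
```

In this toy run, Suffix Tree starts with no facets and receives insertion and deletion from its neighbors, because two of the three topics in the cluster carry them. Facets held by only one topic (traversal, rotation) stay below the threshold and are not propagated.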