Data Summarization with Hierarchical Taxonomy
Xuliang Zhu
Proceedings of the 2021 International Conference on Management of Data, 2021-06-09
DOI: 10.1145/3448016.3450578 (https://doi.org/10.1145/3448016.3450578)
Citation count: 0
Abstract
Data summarization has wide applications in the real world, e.g., attribute filtering, image set labeling, and personalized recommendation. In this work, we study a new problem, HSD, which summarizes a dataset using k concepts from a hierarchical taxonomy. Unlike existing work on whole-hierarchy summarization, we focus on accurate coverage of a given query set Q: the objective is to cover as many items in Q and as few items outside Q as possible. To tackle this problem, we first propose a dynamic-programming-based algorithm for tree hierarchies, which are a simple instance of the HSD problem. For HDAGs, we further propose a heuristic that assigns each vertex to one of its in-neighbors and then applies the tree algorithm to the resulting structure. The experimental results confirm the quality of our methods on both tree and HDAG datasets.
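
To make the tree case concrete, the following is a minimal sketch of a dynamic program of this general shape. It is not the paper's algorithm. It rests on several assumptions that the abstract does not state: a chosen concept covers every item in its subtree, the score is the number of covered items in Q minus the number of covered items outside Q, and at most k concepts may be chosen. The names `summarize_tree` and `dag_to_tree`, and the "keep the first listed in-neighbor" rule, are hypothetical placeholders. The abstract does not describe the actual heuristic.

```python
from typing import Dict, List, Sequence, Set, Tuple

Choice = Tuple[int, Tuple[str, ...]]  # (score, chosen concepts)


def summarize_tree(children: Dict[str, Sequence[str]],
                   items: Dict[str, Sequence[str]],
                   root: str, query: Set[str], k: int) -> Choice:
    """Pick at most k concepts maximizing (covered in Q) - (covered not in Q).

    Assumed scoring, for illustration only. Recursive, so very deep
    hierarchies would need an iterative traversal instead.
    """

    def solve(v: str) -> Tuple[int, List[Choice]]:
        # gain: net score of covering the whole subtree of v with v itself.
        gain = sum(1 if x in query else -1 for x in items.get(v, ()))
        # combined[j]: best result using at most j concepts strictly below v.
        combined: List[Choice] = [(0, ())] * (k + 1)
        for c in children.get(v, ()):
            c_gain, c_best = solve(c)
            gain += c_gain
            merged: List[Choice] = []
            for j in range(k + 1):
                # Knapsack-style merge: give (j - a) concepts to child c.
                options = [(combined[a][0] + c_best[j - a][0],
                            combined[a][1] + c_best[j - a][1])
                           for a in range(j + 1)]
                merged.append(max(options, key=lambda o: o[0]))
            combined = merged
        best = list(combined)
        for j in range(1, k + 1):
            # Choosing v covers its whole subtree, so it replaces any
            # selection made below it.
            if gain > best[j][0]:
                best[j] = (gain, (v,))
        return gain, best

    return solve(root)[1][k]


def dag_to_tree(parents: Dict[str, Sequence[str]]) -> Dict[str, List[str]]:
    """Keep one in-neighbor per vertex (placeholder rule: the first listed)."""
    children: Dict[str, List[str]] = {}
    for v, ps in parents.items():
        if ps:
            children.setdefault(ps[0], []).append(v)
    return children


# Toy taxonomy: items hang off the leaf concepts.
children = {"root": ["animals", "plants"],
            "animals": ["cat", "dog"],
            "plants": ["oak", "rose"]}
items = {"cat": ["i1", "i2"], "dog": ["i3"],
         "oak": ["i4"], "rose": ["i5", "i6"]}
query = {"i1", "i2", "i3", "i5", "i6"}

print(summarize_tree(children, items, "root", query, 2))
# (5, ('animals', 'rose'))
```

With k = 2 the sketch selects `animals` and `rose`, covering all five query items while leaving the non-query item under `oak` uncovered. With k = 1 it falls back to `root`, accepting one item outside Q. For a DAG, `dag_to_tree` would first reduce the hierarchy to a tree, which can then be passed to `summarize_tree`.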