Title: HCUKE: A Hierarchical Context-aware approach for Unsupervised Keyphrase Extraction
Journal: Knowledge-Based Systems (Impact Factor 7.2; JCR Q1, Computer Science, Artificial Intelligence)
Publication type: Journal Article
Published: 2024-09-12
DOI: 10.1016/j.knosys.2024.112511
URL: https://www.sciencedirect.com/science/article/pii/S0950705124011456
Citations: 0
Abstract
Keyphrase Extraction (KE) aims to identify a concise set of words or phrases that effectively summarizes the core ideas of a document. Recent embedding-based models have achieved state-of-the-art performance by jointly modeling local and global contexts in Unsupervised Keyphrase Extraction (UKE). However, these models often ignore either sentence- or document-level contexts, leading directly to weak or incorrect global significance. Furthermore, they rely heavily on local significance, making them vulnerable to noisy data, particularly in long documents, resulting in unstable and suboptimal performance. Intuitively, hierarchical contexts enable a more accurate understanding of the candidates, thereby enhancing their global relevance. Inspired by this, we propose a novel Hierarchical Context-aware Unsupervised Keyphrase Extraction method called HCUKE. Specifically, HCUKE comprises three core modules: (i) a hierarchical context-based global significance measure module that incrementally learns global semantic information from a three-level hierarchical structure; (ii) a phrase-level local significance measure module that captures local semantic information by modeling the context interaction among candidates; and (iii) a candidate ranking module that integrates the measure scores with positional weights to compute a final ranking score. Extensive experiments on three benchmark datasets demonstrate that the proposed method significantly outperforms state-of-the-art baselines.
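The three-module pipeline described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not give the scoring formulas, so everything here is an assumption made for illustration, including the function names `cosine` and `rank_candidates`, the use of cosine similarity over precomputed embedding vectors, the equal weighting of sentence- and document-level contexts, the mixing parameter `alpha`, and the `1 / position` positional weight.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def rank_candidates(candidates, doc_vec, alpha=0.5):
    """Rank candidate phrases by a combined score (hypothetical scoring, not from the paper).

    Each candidate is a dict with:
      'phrase'   - the candidate string
      'vec'      - embedding of the phrase
      'sent_vec' - embedding of the sentence containing it
      'position' - 1-based index of its first occurrence
    """
    scored = []
    for c in candidates:
        # (i) Global significance: similarity to sentence- and document-level
        # contexts, averaged with equal weights (an assumption).
        global_score = 0.5 * (cosine(c["vec"], c["sent_vec"]) + cosine(c["vec"], doc_vec))

        # (ii) Local significance: mean similarity to the other candidates.
        others = [o for o in candidates if o is not c]
        local_score = (
            sum(cosine(c["vec"], o["vec"]) for o in others) / len(others) if others else 0.0
        )

        # (iii) Ranking: blend both scores and favor earlier positions.
        position_weight = 1.0 / c["position"]
        final = position_weight * (alpha * global_score + (1 - alpha) * local_score)
        scored.append((c["phrase"], final))

    # Highest score first.
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Toy usage with made-up 2-dimensional "embeddings".
toy_doc_vec = [1.0, 0.0]
toy_candidates = [
    {"phrase": "A", "vec": [1.0, 0.0], "sent_vec": [1.0, 0.0], "position": 1},
    {"phrase": "B", "vec": [0.0, 1.0], "sent_vec": [1.0, 0.0], "position": 2},
    {"phrase": "C", "vec": [1.0, 1.0], "sent_vec": [1.0, 0.0], "position": 3},
]
ranking = rank_candidates(toy_candidates, toy_doc_vec)
```

In a real setting the vectors would come from a pretrained text encoder, and the way the scores are combined would follow the paper's own definitions rather than the placeholders above.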
About the journal:
Knowledge-Based Systems is an international, interdisciplinary journal in artificial intelligence that publishes original, innovative, and creative research results in the field. It focuses on systems built with knowledge-based and other artificial intelligence techniques. The journal aims to support human prediction and decision-making through data science and computational techniques, to provide balanced coverage of theory and practical study, and to encourage the development and implementation of knowledge-based intelligence models, methods, systems, and software tools. It emphasizes applications in business, government, education, engineering, and healthcare.