{"title":"基于双曲关联匹配的神经关键词提取","authors":"M. Song, Yi Feng, L. Jing","doi":"10.48550/arXiv.2205.02047","DOIUrl":null,"url":null,"abstract":"Keyphrase extraction is a fundamental task in natural language processing that aims to extract a set of phrases with important information from a source document. Identifying important keyphrases is the central component of keyphrase extraction, and its main challenge is learning to represent information comprehensively and discriminate importance accurately. In this paper, to address the above issues, we design a new hyperbolic matching model (HyperMatch) to explore keyphrase extraction in hyperbolic space. Concretely, to represent information comprehensively, HyperMatch first takes advantage of the hidden representations in the middle layers of RoBERTa and integrates them as the word embeddings via an adaptive mixing layer to capture the hierarchical syntactic and semantic structures. Then, considering the latent structure information hidden in natural languages, HyperMatch embeds candidate phrases and documents in the same hyperbolic space via a hyperbolic phrase encoder and a hyperbolic document encoder. To discriminate importance accurately, HyperMatch estimates the importance of each candidate phrase by explicitly modeling the phrase-document relevance via the Poincaré distance and optimizes the whole model by minimizing the hyperbolic margin-based triplet loss. Extensive experiments are conducted on six benchmark datasets and demonstrate that HyperMatch outperforms the recent state-of-the-art baselines.","PeriodicalId":382084,"journal":{"name":"North American Chapter of the Association for Computational Linguistics","volume":"2014 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2022-05-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"12","resultStr":"{\"title\":\"Hyperbolic Relevance Matching for Neural Keyphrase Extraction\",\"authors\":\"M. Song, Yi Feng, L. Jing\",\"doi\":\"10.48550/arXiv.2205.02047\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Keyphrase extraction is a fundamental task in natural language processing that aims to extract a set of phrases with important information from a source document. Identifying important keyphrases is the central component of keyphrase extraction, and its main challenge is learning to represent information comprehensively and discriminate importance accurately. In this paper, to address the above issues, we design a new hyperbolic matching model (HyperMatch) to explore keyphrase extraction in hyperbolic space. Concretely, to represent information comprehensively, HyperMatch first takes advantage of the hidden representations in the middle layers of RoBERTa and integrates them as the word embeddings via an adaptive mixing layer to capture the hierarchical syntactic and semantic structures. Then, considering the latent structure information hidden in natural languages, HyperMatch embeds candidate phrases and documents in the same hyperbolic space via a hyperbolic phrase encoder and a hyperbolic document encoder. To discriminate importance accurately, HyperMatch estimates the importance of each candidate phrase by explicitly modeling the phrase-document relevance via the Poincaré distance and optimizes the whole model by minimizing the hyperbolic margin-based triplet loss. Extensive experiments are conducted on six benchmark datasets and demonstrate that HyperMatch outperforms the recent state-of-the-art baselines.\",\"PeriodicalId\":382084,\"journal\":{\"name\":\"North American Chapter of the Association for Computational Linguistics\",\"volume\":\"2014 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2022-05-04\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"12\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"North American Chapter of the Association for Computational Linguistics\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.48550/arXiv.2205.02047\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"North American Chapter of the Association for Computational Linguistics","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.48550/arXiv.2205.02047","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Hyperbolic Relevance Matching for Neural Keyphrase Extraction
Keyphrase extraction is a fundamental task in natural language processing that aims to extract a set of phrases with important information from a source document. Identifying important keyphrases is the central component of keyphrase extraction, and its main challenge is learning to represent information comprehensively and discriminate importance accurately. In this paper, to address the above issues, we design a new hyperbolic matching model (HyperMatch) to explore keyphrase extraction in hyperbolic space. Concretely, to represent information comprehensively, HyperMatch first takes advantage of the hidden representations in the middle layers of RoBERTa and integrates them as the word embeddings via an adaptive mixing layer to capture the hierarchical syntactic and semantic structures. Then, considering the latent structure information hidden in natural languages, HyperMatch embeds candidate phrases and documents in the same hyperbolic space via a hyperbolic phrase encoder and a hyperbolic document encoder. To discriminate importance accurately, HyperMatch estimates the importance of each candidate phrase by explicitly modeling the phrase-document relevance via the Poincaré distance and optimizes the whole model by minimizing the hyperbolic margin-based triplet loss. Extensive experiments are conducted on six benchmark datasets and demonstrate that HyperMatch outperforms the recent state-of-the-art baselines.