Heterogeneous hypergraph learning for literature retrieval based on citation intents

IF 3.5 | Zone 3, Management | JCR Q2 | COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS
Kaiwen Shi, Kan Liu, Xinyan He
Journal: Scientometrics
Published: 2024-06-21
DOI: 10.1007/s11192-024-05066-4 (https://doi.org/10.1007/s11192-024-05066-4)
Citations: 0

Abstract


Heterogeneous hypergraph learning for literature retrieval based on citation intents


Literature retrieval helps scientists find previous work that is relevant to their own research, or even discover new research ideas. However, most literature retrieval models neglect the discrepancy between retrieval results and the ultimate intention of citation. Citation intent refers to a researcher’s motivation for citing a paper. A citation intent graph with homogeneous nodes and heterogeneous hyperedges can represent different types of citation intent. By leveraging the citation intent information contained in such a hypergraph, a retrieval model can learn the citation behaviour in the graph and guide researchers on where to cite each retrieved result. We present a ranking model called CitenGL (Citation Intent Graph Learning) that aims to extract both citation intent information and textual matching signals. The proposed model consists of a heterogeneous hypergraph encoder and a lightweight deep fusion unit for efficiency trade-offs. Compared with traditional literature retrieval, our model fills the gap between retrieval results and citation intention and yields an understandable graph-structured output. We evaluated our model on publicly available full-text paper datasets. Experimental results show that CitenGL outperforms most existing neural ranking models that consider only textual information, which illustrates the effectiveness of integrating citation intent information with textual information. Further ablation analyses show how citation intent information complements text-matching signals and citation networks.
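
The abstract does not give CitenGL's equations, so the following is only an illustrative sketch of the data structure it describes: papers as homogeneous nodes, and hyperedges typed by citation intent. It runs one round of generic node-to-hyperedge-to-node mean aggregation, a common hypergraph message-passing scheme swapped in here for illustration, not the authors' encoder. The paper IDs, intent labels ("background", "method"), feature vectors, per-intent weights, and the function name `propagate` are all assumptions made up for this example.

```python
# Hypothetical toy hypergraph: every node is a paper (homogeneous nodes);
# every hyperedge carries an intent label (heterogeneous hyperedges).
# Labels, IDs, and numbers are illustrative, not taken from the paper.
HYPEREDGES = [
    ("background", {"p1", "p2", "p3"}),
    ("method", {"p1", "p4"}),
    ("method", {"p2", "p4"}),
]

FEATURES = {
    "p1": [1.0, 0.0],
    "p2": [0.0, 1.0],
    "p3": [1.0, 1.0],
    "p4": [0.0, 0.0],
}

# Assumed fixed weight per intent type; a trained encoder would learn these.
INTENT_WEIGHT = {"background": 0.5, "method": 1.0}


def mean(vectors):
    """Element-wise mean of a non-empty list of equal-length vectors."""
    count = len(vectors)
    return [sum(column) / count for column in zip(*vectors)]


def propagate(features, hyperedges, intent_weight):
    """Run one round of node -> hyperedge -> node aggregation.

    Each hyperedge embedding is the mean of its member nodes; each node
    then takes the intent-weighted average of its incident hyperedges.
    Nodes that belong to no hyperedge keep their input features.
    """
    dim = len(next(iter(features.values())))
    totals = {node: [0.0] * dim for node in features}
    weight_sums = {node: 0.0 for node in features}

    for intent, members in hyperedges:
        edge_embedding = mean([features[m] for m in members])
        weight = intent_weight[intent]
        for m in members:
            totals[m] = [t + weight * e for t, e in zip(totals[m], edge_embedding)]
            weight_sums[m] += weight

    updated = {}
    for node, total in totals.items():
        if weight_sums[node] > 0:
            updated[node] = [t / weight_sums[node] for t in total]
        else:
            updated[node] = list(features[node])
    return updated


if __name__ == "__main__":
    for paper, vector in sorted(propagate(FEATURES, HYPEREDGES, INTENT_WEIGHT).items()):
        print(paper, [round(v, 3) for v in vector])
```

Keeping the intent label on the hyperedge rather than on the node is what lets a single paper take part in several intent contexts at once; in this toy data, `p1` is cited both as background and for its method.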

Source journal
Scientometrics (Management Science; Computer Science: Interdisciplinary Applications)
CiteScore: 7.20
Self-citation rate: 17.90%
Articles published: 351
Review time: 1.5 months
About the journal: Scientometrics aims at publishing original studies, short communications, preliminary reports, review papers, letters to the editor and book reviews on scientometrics. The topics covered are results of research concerned with the quantitative features and characteristics of science. Emphasis is placed on investigations in which the development and mechanism of science are studied by means of (statistical) mathematical methods. The Journal also provides the reader with important up-to-date information about international meetings and events in scientometrics and related fields. Appropriate bibliographic compilations are published as a separate section. Due to its fully interdisciplinary character, Scientometrics is indispensable to research workers and research administrators throughout the world. It provides valuable assistance to librarians and documentalists in central scientific agencies, ministries, research institutes and laboratories. Scientometrics includes the Journal of Research Communication Studies. Consequently, its aims and scope cover those of the latter, namely, to bring the results of research investigations together in one place, in such a form that they will be of use not only to the investigators themselves but also to the entrepreneurs and research workers who form the object of these studies.