An ESTs detection research based on paper entity mapping: Combining scientific text modeling and neural prophet

IF 4.3, Zone 3 (Materials Science), Q1 ENGINEERING, ELECTRICAL & ELECTRONIC
Dejian Yu, Bo Xiang
DOI: 10.1016/j.joi.2024.101551
Publication date: 2024-06-08
Publication type: Journal Article
Journal (as listed): ACS Applied Electronic Materials
URL: https://www.sciencedirect.com/science/article/pii/S1751157724000646
Citations: 0

Abstract

Existing studies on detecting emerging scientific topics (ESTs) overemphasize newness and neglect the content innovation of knowledge. They also ignore the lag inherent in knowledge diffusion. In this paper, we propose a four-stage detection framework for ESTs that maps emerging attributes from paper entities to scientific topics. We validate the approach empirically on two markedly different disciplinary datasets, IS-LS and AI, which contain 73,601 and 255,620 publications, respectively. First, we generate 29 and 47 candidate scientific topics, respectively, through topic modeling. Second, we represent the novelty of paper entities using pre-trained language models and map it, together with knowledge distributions, onto scientific topic entities to obtain three topic emerging attributes: topic novelty, relative share, and growth. Third, we predict future trends of these attributes with Neural Prophet, which outperforms four baseline models in R², MAE, and RMSE. Finally, using the predicted future values, we group the candidate scientific topics into 8 clusters containing two types of ESTs, by means of strategic market theory and a clustering model. From the correlation and feature distribution analysis of the emerging attributes, we find resilience and scale advantage in the diffusion of scientific knowledge. We also find significant uncertainty in previous citation-based patterns of scientific topic evaluation, caused by the complexity of citation behavior. Overall, this research enriches the theoretical knowledge and detection frameworks of ESTs and provides detailed insights into the comprehensive assessment and dissemination of scientific topics.
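
The second stage, mapping paper-level novelty onto topics, can be illustrated with a minimal Python sketch. The abstract does not give the formulas, so everything here is an assumption made for illustration: each paper is assumed to carry a novelty score and a topic distribution; topic novelty is taken as the topic-weighted mean of paper novelty; relative share is the topic's portion of total topic weight in a year; and growth is the relative year-over-year change in share. The function names `topic_attributes` and `share_growth` and the input layout are hypothetical, not the authors' implementation.

```python
from collections import defaultdict


def topic_attributes(papers):
    """Aggregate paper-level novelty into per-year topic attributes.

    ASSUMED input layout (not from the paper): each paper is a dict with
    "year" (int), "novelty" (float) and "topics" (dict: topic -> weight).
    Returns {year: {topic: {"novelty": float, "share": float}}}.
    """
    weight = defaultdict(lambda: defaultdict(float))            # year -> topic -> sum of weights
    weighted_novelty = defaultdict(lambda: defaultdict(float))  # year -> topic -> sum of weight * novelty

    for paper in papers:
        year = paper["year"]
        for topic, w in paper["topics"].items():
            weight[year][topic] += w
            weighted_novelty[year][topic] += w * paper["novelty"]

    result = {}
    for year, per_topic in weight.items():
        total = sum(per_topic.values())
        result[year] = {
            topic: {
                # ASSUMPTION: topic novelty = weight-averaged paper novelty
                "novelty": weighted_novelty[year][topic] / w,
                # ASSUMPTION: relative share = topic weight / total weight in that year
                "share": w / total,
            }
            for topic, w in per_topic.items()
            if w > 0
        }
    return result


def share_growth(attrs, topic, year):
    """Relative change in a topic's share from year - 1 to year.

    Returns None when either year is missing or the previous share is zero.
    ASSUMPTION: this growth definition is illustrative only.
    """
    prev = attrs.get(year - 1, {}).get(topic)
    curr = attrs.get(year, {}).get(topic)
    if prev is None or curr is None or prev["share"] == 0:
        return None
    return (curr["share"] - prev["share"]) / prev["share"]


# Toy data, invented purely to exercise the functions.
papers = [
    {"year": 2020, "novelty": 0.2, "topics": {"A": 1.0}},
    {"year": 2020, "novelty": 0.6, "topics": {"A": 0.5, "B": 0.5}},
    {"year": 2021, "novelty": 0.8, "topics": {"B": 1.0}},
    {"year": 2021, "novelty": 0.4, "topics": {"A": 0.5, "B": 0.5}},
]
attrs = topic_attributes(papers)
print(attrs[2021]["B"], share_growth(attrs, "B", 2021))
```

In a pipeline like the one the abstract describes, per-year series of this kind would be what a forecasting model such as Neural Prophet is fitted to; the sketch stops before that step.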

Source journal: CiteScore 7.20; self-citation rate 4.30%; articles published 567