Predicting the emergence of disruptive technologies by comparing with references via soft prompt-aware shared BERT

IF 3.5 | Zone 2 (Management Science) | JCR Q2: COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS

Journal of Informetrics. Pub Date: 2024-10-16. DOI: 10.1016/j.joi.2024.101596

Guoxiu He, Chenxi Lin, Jiayu Ren, Peichen Duan

Volume 18, Issue 4, Article 101596. Source: https://www.sciencedirect.com/science/article/pii/S1751157724001081

Citations: 0

Abstract

The exponential increase in the annual volume of publications poses a significant challenge for assessing the disruptive potential of technologies in new papers. Prior approaches that identify disruptive technologies from accumulated paper citations offer limited foresight and are time-consuming. Moreover, the total citation count fails to capture the intricate network of citations associated with the focal papers. Consequently, we advocate using the disruption index rather than citation counts. In particular, we devise a novel neural network, called Soft Prompt-aware Shared BERT (SPS-BERT), to predict the potential technological disruption index of newly published papers. It incorporates separate soft prompts that enable BERT to examine comparative details between a paper's abstract and its references. Additionally, a tailored attention mechanism is employed to intensify the semantic comparison. Based on the enhanced representation derived from BERT, we use a linear layer to estimate the potential disruption index. Experimental results demonstrate that SPS-BERT outperforms existing state-of-the-art methods in predicting the five-year disruption index on the DBLP and PubMed datasets. We also evaluate our model on predicting the ten-year disruption index and five-year citation increments, demonstrating its robustness and scalability. Notably, our model's predictions of disruptive technologies, based on papers published in 2022, align with the expert assessments released by MIT, highlighting its practical significance. The code is available at https://github.com/ECNU-Text-Computing/SPS-BERT.
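The abstract does not spell out how the disruption index is computed. As a rough illustration only, the sketch below follows the commonly used citation-network definition, D = (n_i - n_j) / (n_i + n_j + n_k), where n_i counts later papers citing the focal paper but none of its references, n_j counts those citing both, and n_k counts those citing only the references. Whether the paper uses exactly this variant is an assumption, and the function name `disruption_index` and the toy paper identifiers are hypothetical.

```python
def disruption_index(focal, references, citations):
    """Compute an assumed form of the disruption index for one focal paper.

    focal      -- identifier of the focal paper
    references -- iterable of identifiers the focal paper cites
    citations  -- dict mapping each later paper to the set of papers it cites
    """
    refs = set(references)
    n_i = n_j = n_k = 0
    for paper, cited in citations.items():
        if paper == focal:
            continue  # the focal paper is not one of its own citers
        cites_focal = focal in cited
        cites_refs = bool(refs & set(cited))
        if cites_focal and not cites_refs:
            n_i += 1  # cites the focal paper only
        elif cites_focal and cites_refs:
            n_j += 1  # cites the focal paper and at least one reference
        elif cites_refs:
            n_k += 1  # cites the references but not the focal paper
    total = n_i + n_j + n_k
    # With no relevant later papers the index is undefined; 0.0 is a
    # convention chosen for this sketch, not something stated in the source.
    return (n_i - n_j) / total if total else 0.0


# Toy citation network (hypothetical data).
toy_citations = {
    "A": {"F"},
    "B": {"F"},
    "C": {"F", "R1"},
    "D": {"R2"},
    "E": {"X"},
}
toy_index = disruption_index("F", ["R1", "R2"], toy_citations)
print(toy_index)
```

In this toy network two papers cite only the focal paper, one cites both it and a reference, and one cites only a reference, so the value is (2 - 1) / 4. Values near 1 suggest later work builds on the focal paper while bypassing its references; values near -1 suggest the focal paper consolidates existing work.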

Source journal

Journal of Informetrics (Social Sciences: Library and Information Sciences)

CiteScore: 6.40

Self-citation rate: 16.20%

About the journal: Journal of Informetrics (JOI) publishes rigorous, high-quality research on quantitative aspects of information science. The main focus of the journal is on topics in bibliometrics, scientometrics, webometrics, patentometrics, altmetrics and research evaluation. Contributions studying informetric problems using methods from other quantitative fields, such as mathematics, statistics, computer science, economics and econometrics, and network science, are especially encouraged. JOI publishes both theoretical and empirical work. In general, case studies, for instance a bibliometric analysis focusing on a specific research field or a specific country, are not considered suitable for publication in JOI unless they contain innovative methodological elements.