Title: Predicting the emergence of disruptive technologies by comparing with references via soft prompt-aware shared BERT
Authors: Guoxiu He, Chenxi Lin, Jiayu Ren, Peichen Duan
Journal: Journal of Informetrics, 18(4), Article 101596
Publication date: 2024-10-16
Publication type: Journal Article
DOI: 10.1016/j.joi.2024.101596
URL: https://www.sciencedirect.com/science/article/pii/S1751157724001081
Impact factor: 3.4 (JCR Q2, Computer Science, Interdisciplinary Applications)
Open access: No
Citations: 0
Abstract
The exponential growth in the annual volume of publications makes it increasingly difficult to assess the disruptive potential of technologies described in new papers. Prior approaches identify disruptive technologies from accumulated paper citations, so they offer limited foresight and are time-consuming. Moreover, a total citation count fails to capture the intricate network of citations associated with a focal paper. Consequently, we advocate using the disruption index rather than citation counts. Specifically, we devise a novel neural network, Soft Prompt-aware Shared BERT (SPS-BERT), to predict the potential technological disruption index of newly published papers. It incorporates separate soft prompts that enable BERT to examine comparative details within a paper's abstract and its references. Additionally, a tailored attention mechanism is employed to strengthen the semantic comparison. Based on the enhanced representation derived from BERT, a linear layer estimates the potential disruption index. Experimental results demonstrate that SPS-BERT outperforms existing state-of-the-art methods in predicting the five-year disruption index on the DBLP and PubMed datasets. We also evaluate the model on predicting the ten-year disruption index and five-year citation increments, demonstrating its robustness and scalability. Notably, the model's predictions of disruptive technologies, based on papers published in 2022, align with the expert assessments released by MIT, highlighting its practical significance. The code is available at https://github.com/ECNU-Text-Computing/SPS-BERT.
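The abstract does not spell out how the disruption index is computed. As an illustration only, the sketch below assumes the commonly used citation-network definition, D = (n_i - n_j) / (n_i + n_j + n_k), where n_i counts later papers citing the focal paper but none of its references, n_j counts those citing both, and n_k counts those citing only the references. The function name, argument names, and toy data are hypothetical and are not taken from the SPS-BERT repository; the paper may use a different variant.

```python
from typing import Dict, Set


def disruption_index(focal: str,
                     references: Set[str],
                     citations: Dict[str, Set[str]]) -> float:
    """Assumed standard definition: D = (n_i - n_j) / (n_i + n_j + n_k).

    `citations` maps each later paper's id to the set of paper ids it cites.
    """
    n_i = n_j = n_k = 0
    for citing, cited in citations.items():
        if citing == focal:
            continue  # the focal paper does not count as its own citer
        cites_focal = focal in cited
        cites_refs = bool(cited & references)
        if cites_focal and not cites_refs:
            n_i += 1  # builds on the focal paper alone
        elif cites_focal and cites_refs:
            n_j += 1  # builds on the focal paper and its predecessors
        elif cites_refs:
            n_k += 1  # bypasses the focal paper
    total = n_i + n_j + n_k
    # Returning 0.0 for a paper with no relevant citers is a choice made here.
    return (n_i - n_j) / total if total else 0.0


# Hypothetical toy citation data for a focal paper "F" with references R1, R2.
toy_citations = {
    "A": {"F"},
    "B": {"F"},
    "C": {"F", "R1"},
    "D": {"R2"},
    "E": {"X"},  # unrelated paper, ignored
}
toy_score = disruption_index("F", {"R1", "R2"}, toy_citations)
```

In the toy data, two papers cite only the focal paper, one cites both, and one cites only a reference, so the score is (2 - 1) / 4. Unlike a raw citation count, the value lies in [-1, 1] and depends on how citers treat the focal paper's references, which is the network structure the abstract says total citations miss.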
About the journal:
Journal of Informetrics (JOI) publishes rigorous high-quality research on quantitative aspects of information science. The main focus of the journal is on topics in bibliometrics, scientometrics, webometrics, patentometrics, altmetrics and research evaluation. Contributions studying informetric problems using methods from other quantitative fields, such as mathematics, statistics, computer science, economics and econometrics, and network science, are especially encouraged. JOI publishes both theoretical and empirical work. In general, case studies, for instance a bibliometric analysis focusing on a specific research field or a specific country, are not considered suitable for publication in JOI, unless they contain innovative methodological elements.