{"title":"A comprehensive comparative analysis of publication monopoly phenomenon in scientific journals","authors":"Chengjun Zhang , ZhengJu Ren , Gaofeng Xiang , Wenbin Yu , Zeyu Xu , Jin Liu , Yadang Chen","doi":"10.1016/j.joi.2024.101628","DOIUrl":"10.1016/j.joi.2024.101628","url":null,"abstract":"<div><div>The increasing number of academic practitioners has resulted in a significantly increased volume of scientific papers, attracting considerable interest among researchers examining this correlation. However, little research has been devoted to the phenomenon of scientists monopolizing authorship in academic journals. This study thus introduces the term Publication Monopoly (PM) to describe this effect. The study refers to the prolific authors as Monopoly Authors. In addition, it proposes a Monopoly Index to assess PM severity. For each journal, the Monopoly Contribution (MC) quantifies the impact of Monopoly Authors. Using the Open Academic Graph dataset, our analysis explores the prevalence of PM and the corresponding MC in selected journals and academic fields. The findings demonstrate a positive relationship between the number of articles published and the likelihood of PM occurrence in most journals. Furthermore, fields relying heavily on laboratory environments or specialized equipment are particularly susceptible to PM. Additionally, once a journal becomes entrenched in PM, it is challenging to alleviate this phenomenon over time. Our study of PM aimed to prompt academic practitioners to carefully consider the likelihood of acceptance in journals characterized by high PM levels. 
Moreover, the study encourages journals to reconsider their need to accept more articles from Monopoly Authors.</div></div>","PeriodicalId":48662,"journal":{"name":"Journal of Informetrics","volume":"19 1","pages":"Article 101628"},"PeriodicalIF":3.4,"publicationDate":"2024-12-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142759638","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"管理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Leveraging patent classification based on deep learning: The case study on smart cities and industrial Internet of Things","authors":"Munan Li , Liang Wang","doi":"10.1016/j.joi.2024.101616","DOIUrl":"10.1016/j.joi.2024.101616","url":null,"abstract":"<div><div>With the trends of technology convergence and technology interdisciplinarity, technology-field (TF) resolution and classification of patents have become increasingly challenging. Whether for patent applicants or for patent examiners, more precisely labeling the TF for a certain patent is important for technological searches. However, determining the TF of a patent may be difficult and may even involve the strategic behavior of patenting, which can cause noise in patent classification systems (PCSs). In addition, some specific patents could contain more TFs than claimed or be assigned questionable IPC codes; subsequently, in a regular search for technology/patents, information could be missed. Considering the advantages of deep learning compared with traditional machine learning algorithms in areas such as natural language processing (NLP), text classification and text sentiment analysis, this paper investigates several popular deep learning models and proposes a large-scale multilabel regression (MLR) model to handle specific patent analyses under situations of small sample learning. To verify the proposed MLR model for patent classification, a case study on smart cities and industrial Internet of Things (IIoT) is conducted. The MLR experiments on the TF resolution of smart cities and IIoT have yielded moderate results compared with those of the latest patent classification studies, which also rely on deep learning models and large language models (LLMs), including RCNN, Bi-LSTM, BERT and GPT-4. 
Therefore, the proposed MLR model with a customized loss function could be moderately effective for patent classification within a specific technology theme, could have implications for patent classification and the TF resolution of patents, and could further enrich methodologies for patent mining and informetrics based on artificial intelligence (AI).</div></div>","PeriodicalId":48662,"journal":{"name":"Journal of Informetrics","volume":"19 1","pages":"Article 101616"},"PeriodicalIF":3.4,"publicationDate":"2024-11-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142747971","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"管理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Linkages among science, technology, and industry on the basis of main path analysis","authors":"Shuo Xu , Zhen Liu , Xin An , Hong Wang , Hongshen Pang","doi":"10.1016/j.joi.2024.101617","DOIUrl":"10.1016/j.joi.2024.101617","url":null,"abstract":"<div><div>Compared to the science-technology linkages, the linkages among science, technology, and industry are largely under-studied. Therefore, this paper proposes a main path analysis based framework to discover the science-technology-industry linkages, in which scientific publications, patents, and products are viewed as respective proxies of scientific research, technological advance, and industrial development. To validate the feasibility and effectiveness of our framework, the DrugBank dataset for the pharmaceutical industry was downloaded in XML form on 1 November 2019; the dataset was then enriched, drug entity mentions were recognized in scholarly articles and patents, and several citation cycles were eliminated. The scientific publications span from 1871 to 2019, and patents from 1953 to 2019. There are 8,421, 5,590, and 2,136 article, patent, and drug nodes and 41,200 citations in the largest weakly connected component of the constructed heterogeneous citation network. From empirical analysis on the largest weakly connected component, the main conclusions can be drawn as follows. (1) The discovered developmental trajectories indeed encode the interactions among science, technology, and industry. Science and technology not only interact with each other, but also jointly promote the development of the industry, and the industry, in turn, influences the advancement of science and technology. (2) The developmental modes in the pharmaceutical industry can be grouped into three categories: pushed by only science, pushed by only technology, and pushed by science and technology simultaneously. 
(3) The drugs bridge scientific research and technological advance, and thereby help enhance knowledge exchanges between science and technology and shorten the cycle of drug development. This study contributes to discovering the linkages among science, technology, and industry from the perspective of mutual citations among scholarly articles, patents, and products. However, verification of our framework in industries other than the pharmaceutical industry remains to be investigated.</div></div>","PeriodicalId":48662,"journal":{"name":"Journal of Informetrics","volume":"19 1","pages":"Article 101617"},"PeriodicalIF":3.4,"publicationDate":"2024-11-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142701974","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"管理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Citation counts and inclusion of references in seven free-access scholarly databases: A comparative analysis","authors":"Lorena Delgado-Quirós , José Luis Ortega","doi":"10.1016/j.joi.2024.101618","DOIUrl":"10.1016/j.joi.2024.101618","url":null,"abstract":"<div><div>The aim of this study is to examine disparities in citation counts amongst scholarly databases and the reasons that contribute to these differences. A random Crossref sample of >115k DOIs was selected and subsequently searched across six databases (Dimensions, Google Scholar, Microsoft Academic, Scilit, Semantic Scholar and The Lens). In July 2021, citation counts and lists of references were extracted from each database for comparative processing and analysis. The findings indicate that publications in Crossref-based databases (Crossref, Dimensions, Scilit and The Lens) have similar citation counts, while documents in search engines (Google Scholar, Microsoft Academic and Semantic Scholar) have a higher number of citations due to a greater coverage of publications, but also to the integration of web copies. Analysis of references has revealed that Scilit only extracts references with Digital Object Identifiers (DOI) and that Semantic Scholar causes significant problems when it adds references from external web versions. Ultimately, the study has shown that all the databases struggle to index references from books and book chapters, which may be attributable to certain academic publishers. 
The study concludes with a discussion of the potential effects on research evaluation that may arise from this lack of citations.</div></div>","PeriodicalId":48662,"journal":{"name":"Journal of Informetrics","volume":"19 1","pages":"Article 101618"},"PeriodicalIF":3.4,"publicationDate":"2024-11-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142701972","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"管理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Gender differences in dropout rate: From field, career status, and generation perspectives","authors":"Yunhan Yang , Chenwei Zhang , Huimin Xu , Yi Bu , Meijun Liu , Ying Ding","doi":"10.1016/j.joi.2024.101615","DOIUrl":"10.1016/j.joi.2024.101615","url":null,"abstract":"<div><div>The dropout of scholars poses risks by depleting valuable resources and hindering the scientific community. Existing knowledge on this issue lacks consistency across career statuses and overlooks its dynamic nature. To address this gap, we analyzed the career trajectories of over 24 million scholars in 19 fields from the MAG dataset, examining dropout rates by field, career status, and generation. Firstly, we observed an unexpectedly high proportion of transients, comprising a growing proportion of newcomers and accounting for over 50% of publications in most soft sciences. This highlights the shortage of continuants, such as scholars with full careers, who contribute to scientific communities. Secondly, our exploration into gender-specific dropout rates revealed that women exhibit significantly higher dropout rates within the first 20 years, covering career statuses including junior dropout, early-career dropout, and mid-career dropout. Notably, early- and mid-career dropouts demonstrate the lowest and most stable dropout rates. These insights prompted the development of a gendered scientific career model that combines changes in scholar numbers and dropout rates across career statuses. Lastly, our generational analysis spanning four generations unveiled a diminishing gender gap in dropout rates. In hard sciences, women encounter initial career challenges, with the gender gap in dropout rates decreasing over time. In contrast, the gender gap in soft sciences persists longer. 
These findings hold consistent across six subfields, offering implications for field evaluation, gender disparities policies, and a deeper understanding of scholarly dropout across generations.</div></div>","PeriodicalId":48662,"journal":{"name":"Journal of Informetrics","volume":"19 1","pages":"Article 101615"},"PeriodicalIF":3.4,"publicationDate":"2024-11-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142701969","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"管理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Collaborating with top scientists may not improve paper novelty: A causal analysis based on the propensity score matching method","authors":"Linlin Ren , Lei Guo , Hui Yu , Feng Guo , Xinhua Wang , Xiaohui Han","doi":"10.1016/j.joi.2024.101609","DOIUrl":"10.1016/j.joi.2024.101609","url":null,"abstract":"<div><div>Most previous collaboration studies concentrate on examining cooperation models, often overlooking the pivotal role played by a Top Scientist (TS) in scientific advancements. To the best of our knowledge, only one relevant work delves into the correlation between innovation and collaboration with TSs, and no research has explored this relationship from a causal perspective. More precisely, previous studies suffer from several limitations in their examination of this topic: 1) Existing studies on Papers' Novelty (PN) primarily focus on calculating methods, with limited exploration of its relationship with scientific cooperation. 2) Research that has explored the link between collaboration with TSs and output innovation often adopts a correlational perspective, lacking a causal analysis that could correct for potential confounding factors. 3) Previous methodologies overlook the attributes of citation networks as potential confounding factors, a crucial consideration in identifying identical papers in causal analyses. 4) The impact of disciplinary diversity of papers on the innovation output when collaborating with TSs is often overlooked in prior research. To address these limitations, we conduct a causal analysis of publications in three subfields of computer science from the Web of Science (WoS) database to demonstrate the impact of collaborating with TSs on PN. Specifically, to tackle Limitations 1) and 2), we employ PN as a metric to assess the quality of academic output and explore its causal relationship with collaborating with TSs using the Propensity Score Matching (PSM) method. 
To address Limitation 3), we comprehensively consider potential confounding factors influencing PSM matching by further incorporating the attributes of citation networks, thereby minimizing selection bias. To deal with Limitation 4), we not only focus on the overall treatment effect but also delve into the treatment effect of intra-disciplinary and interdisciplinary collaboration modes. The research findings indicate that the papers collaborating with TSs exhibit lower PN compared to those without the participation of TSs. This suggests that collaboration with TSs may come at the cost of reduced novelty. This discovery prompts profound reflections on scientific collaboration, emphasizing the challenges and trade-offs that may exist in collaboration.</div></div>","PeriodicalId":48662,"journal":{"name":"Journal of Informetrics","volume":"19 1","pages":"Article 101609"},"PeriodicalIF":3.4,"publicationDate":"2024-11-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142701973","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"管理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Inter- and intra-domain knowledge flows: Examining their relationship with impact at the field level over time","authors":"Giovanni Abramo , Ciriaco Andrea D'Angelo","doi":"10.1016/j.joi.2024.101614","DOIUrl":"10.1016/j.joi.2024.101614","url":null,"abstract":"<div><div>Just as innovations often succeed in fields beyond their original domains, this study explores whether the same applies to scientific discoveries. We investigate the flow of knowledge across scientific disciplines by analyzing connections between 2015 cited publications indexed in the Web of Science and their citing counterparts. Specifically, we measure the rates of knowledge dissemination within and across different fields. Our study addresses key questions about disparities between inter- and intra-domain dissemination rates, the relationship between dissemination types and scholarly impact, and the evolution of these patterns over time. These findings enhance our understanding of knowledge flows and provide practical insights with significant implications for evaluative bibliometrics.</div></div>","PeriodicalId":48662,"journal":{"name":"Journal of Informetrics","volume":"19 1","pages":"Article 101614"},"PeriodicalIF":3.4,"publicationDate":"2024-11-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142701165","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"管理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Scientific knowledge role transition prediction from a knowledge hierarchical structure perspective","authors":"Jinqing Yang , Jiming Hu","doi":"10.1016/j.joi.2024.101612","DOIUrl":"10.1016/j.joi.2024.101612","url":null,"abstract":"<div><div>There are several potential patterns in the evolution of scientific knowledge. In order to delve deeper into the changes in function and role during the evolution of knowledge, we have proposed a research framework that examines the transition of scientific knowledge roles from the perspective of a hierarchical structure. We constructed two classification models of transition possibility and transition type to predict whether a piece of knowledge undergoes a role transition and which type of role transition it belongs to. Several datasets were constructed by utilizing the entire corpus of publications available in <em>PubMed</em> and the history records of <em>MeSH</em>. Among the tasks of transition type prediction and transition possibility prediction, the <em>Gradient Boosting</em> classifier performed the best. The binary classification model of transition possibility achieved a precision of 72.58 %, a recall of 71.04 %, and an F1 score of 71.78 %. The multi-class classification model of transition type had a macro-F1 score of 61.29 %, a micro-F1 score of 84.07 %, and a weighted-F1 score of 82.90 %. Further, we found that the knowledge genealogy features contribute the most to the prediction of transition possibility while knowledge attribute and network structure features have a significantly greater influence on the prediction of transition type. 
Most features have an obvious effect on the role transition of the <strong><em>Content-change type</em></strong>, followed by <strong><em>Child-generation</em></strong> and <strong><em>Localization-shift types.</em></strong></div></div>","PeriodicalId":48662,"journal":{"name":"Journal of Informetrics","volume":"19 1","pages":"Article 101612"},"PeriodicalIF":3.4,"publicationDate":"2024-11-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142701971","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"管理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Early identification of breakthrough technologies: Insights from science-driven innovations","authors":"Dan Wang , Xiao Zhou , Pengwei Zhao , Juan Pang , Qiaoyang Ren","doi":"10.1016/j.joi.2024.101606","DOIUrl":"10.1016/j.joi.2024.101606","url":null,"abstract":"<div><div>Identifying breakthrough technologies is crucial for advancing technological innovation and, in this sense, the innovation patterns driven by science are considered to be key pathways for forming breakthrough technologies. Building on this premise, this paper presents a framework for identifying breakthrough technologies that starts with these signals of scientific innovation. The first step in the method is to construct a science-technology knowledge network based on papers and patents. Then a two-stage selection method funnels the scientific innovation signals, filtering out those with the potential to trigger technological breakthroughs. Next, a machine learning-based link prediction model, integrating three types of features, identifies new links between science-driven signals and existing technologies. A community detection algorithm then identifies sub-networks of technologies formed around these new links. Finally, a structural entropy index is used to evaluate these sub-networks to determine potential breakthrough technologies. By systematically characterizing the content and core features of scientific innovation signals, this study reveals the driving sources of technological breakthroughs and sheds light on the absorption and diffusion processes of scientific innovation. We validated the method through a use case in the field of artificial intelligence. 
Those who manage technological innovation should find the insights of this research particularly valuable.</div></div>","PeriodicalId":48662,"journal":{"name":"Journal of Informetrics","volume":"19 1","pages":"Article 101606"},"PeriodicalIF":3.4,"publicationDate":"2024-11-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142701166","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"管理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Quantification and identification of authorial writing style through higher-order text network modeling and analysis","authors":"Hongzhong Deng, Chengxing Wu, Bingfeng Ge, Hongqian Wu","doi":"10.1016/j.joi.2024.101603","DOIUrl":"10.1016/j.joi.2024.101603","url":null,"abstract":"<div><div>Determining the true author of anonymized texts has important applications ranging from text classification and information extraction to forensic investigations. Despite substantial progress, current authorship identification solutions are limited to extracting straightforward semantic relationships in writing styles, lacking consideration for higher-order features among multiple vocabulary, phrases, or sentences in language structure. Here, we propose a novel approach based on hypernetwork theory to encode higher-order text features into a unified text hyper-network and investigate whether the hyper-order topological features of the text hyper-network contribute to revealing the author's stylistic preferences. Our results indicate that metrics of the text hyper-network, such as hyperdegree, average shortest path length, and intermittency, can capture more information about the author's writing styles. More importantly, in the author identification task of 170 novels, our method accurately distinguished the authorship of 81% of the novels, surpassing the accuracy of the method of using paired word relationships. 
This further highlights the importance of higher-order features in text analysis, beyond mere pairwise interactions of words.</div></div>","PeriodicalId":48662,"journal":{"name":"Journal of Informetrics","volume":"19 1","pages":"Article 101603"},"PeriodicalIF":3.4,"publicationDate":"2024-11-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142701970","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"管理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}