{"title":"Investigating the research output of institutions","authors":"Wolfgang G. Stock , Gerhard Reichmann , Christian Schlögl","doi":"10.1016/j.joi.2025.101638","DOIUrl":"10.1016/j.joi.2025.101638","url":null,"abstract":"<div><div>Describing, analyzing, and evaluating research institutions are among the main tasks of scientometrics and research evaluation. But how can we optimally search for an institution's research output? Possible search arguments include institution names, affiliations, addresses, and affiliated authors’ names. Prerequisites of these search tasks are complete lists (or at least good approximations) of the institutions’ publications, and—in later steps—their citations, and topics. When searching for the publications of research institutions in an information service, there are two options, namely (1) searching directly for the name of the institution and (2) searching for all authors affiliated with the institution in a defined time interval. Which strategy is more effective? More specifically, do informetric indicators such as recall and precision, search recall and search precision, and relative visibility change depending on the search strategy? What are the reasons for differences? To illustrate our approach, we conducted an illustrative study on two information science institutions and identified all staff members. The search was performed using the Web of Science Core Collection (WoS CC). As a performance indicator, applying fractional counting and considering co-affiliations of authors, we used the institution's relative visibility in an information service. 
We also calculated two variants of recall and precision at the institution level, namely search recall and search precision as informetric measures of performance differences between different search strategies (here: author search versus institution search) on the same information service (here: WoS CC) and recall and precision in relation to the complete set of an institution's publications. For all our calculations, there is a clear result: Searches for affiliated authors outperform searches for institutions in WoS. However, especially for large institutions it is difficult to determine all the staff members in the time interval of research. Additionally, information services (including WoS) are incomplete and there are variants for the names of institutions in the services. Therefore, searching for institutions and the publication-based quantitative evaluation of institutions are very critical issues.</div></div>","PeriodicalId":48662,"journal":{"name":"Journal of Informetrics","volume":"19 2","pages":"Article 101638"},"PeriodicalIF":3.4,"publicationDate":"2025-01-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143161104","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"管理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Distinguishing articles in questionable and non-questionable psychology journals using quantitative indicators associated with quality","authors":"Dimity Stephen","doi":"10.1016/j.joi.2025.101640","DOIUrl":"10.1016/j.joi.2025.101640","url":null,"abstract":"<div><div>This study investigates the viability of distinguishing articles in questionable journals (QJs) from those in non-QJs on the basis of quantitative indicators typically associated with quality. Subsequently, I examine what can be deduced about the quality of articles in QJs based on the differences observed. The samples comprise 1,714 articles from 31 QJs, 1,691 articles from 16 journals indexed in Web of Science (WoS), and 1,900 articles from 45 mid-tier journals, all in the field of psychology. I contrast between samples the length of abstracts and full-texts, prevalence of spelling errors, text readability, number of references and citations, the size and internationality of the author team, the documentation of ethics and informed consent statements, and the presence of statistical errors. The results suggest that QJ articles do diverge from the disciplinary standards set by peer-reviewed journals in psychology on quantitative indicators of quality that tend to reflect the effect of peer review and editorial processes. However, mid-tier and WoS journals are also affected by potential quality concerns, such as under-reporting of ethics and informed consent processes and the presence of errors in interpreting statistics. 
Further research is required to develop a comprehensive understanding of the quality of articles in QJs.</div></div>","PeriodicalId":48662,"journal":{"name":"Journal of Informetrics","volume":"19 2","pages":"Article 101640"},"PeriodicalIF":3.4,"publicationDate":"2025-01-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143161176","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"管理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A measure and the related models for characterizing the usage of academic journal","authors":"Lili Qiao , Star X. Zhao , Yutong Ji , Wu Li","doi":"10.1016/j.joi.2025.101643","DOIUrl":"10.1016/j.joi.2025.101643","url":null,"abstract":"<div><div>Based on the underlying usage data given by the <em>Web of Science</em>, we establish a novel metric, termed U<sub>h</sub>-index for multi-dimensional assessment of academic journals. Our research objectively examines the empirical and theoretical dimensions of the U<sub>h</sub>-index, assessing its validity and potential use in scientific evaluation. For this study, we conducted a quantitative analysis of the U<sub>h</sub>-index for 1,603 journals across the fields of physics, chemistry, economics, and management, and explored potential theory models. It reveals that the U<sub>h</sub>-index, as a literature metric based on usage data, is more sensitive and discriminatory compared to the h-index, which relies solely on citation data. Additionally, the U<sub>h</sub>-index and paper usage data were consistent with both the Glänzel–Schubert and the power-law model. It indicates that the U<sub>h</sub> index, as an impact observatory index, aligns with the fundamental principles of scientific knowledge dissemination, thereby holding significant scientific value. It facilitates the quantification of dissemination characteristics of core articles in journals, laying the foundation for a novel approach to categorizing and evaluating journals based on both theoretical orientation and practical application. 
Finally, from a multidimensional research evaluation perspective, the U<sub>h</sub>-index offers a transitional dimension for observation, bridging the gap between academic citations and the broader dissemination of research through social media.</div></div>","PeriodicalId":48662,"journal":{"name":"Journal of Informetrics","volume":"19 2","pages":"Article 101643"},"PeriodicalIF":3.4,"publicationDate":"2025-01-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143161177","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"管理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Forecasting topic trends of blockchain utilizing topic modeling and deep learning-based time-series prediction on different document types","authors":"Yejin Park , Seonkyu Lim , Changdai Gu , Arida Ferti Syafiandini , Min Song","doi":"10.1016/j.joi.2025.101639","DOIUrl":"10.1016/j.joi.2025.101639","url":null,"abstract":"<div><div>Topic trends in rapidly evolving domains like blockchain are dynamic and pose prediction challenges. To address this, we propose a novel framework that integrates topic modeling, clustering, and time-series deep learning models. These models include both non-graph-based and graph-based approaches. Blockchain-related documents of three types—academic papers, patents, and news articles—are collected and preprocessed. Random and topic subgraphs are constructed as inputs for model training and forecasting across various time epochs. The four models (LSTM, GRU, AGCRN, and A3T-GCN) are trained on random subgraphs, and the trained models forecast topic trends using topic subgraphs. We also analyze the distinctive characteristics of each document type and investigate the causal relationships between them. The results indicate that non-graph-based models, such as LSTM, perform better on periodic data like academic papers, whereas graph-based models, such as AGCRN and A3T-GCN, excel at capturing non-periodic patterns in patents and news articles. Our framework demonstrates robust performance, offering a versatile tool for blockchain-related trend analysis and forecasting. 
The code and environments are available at <span><span>https://github.com/textmining-org/topic-forecasting</span></span>.</div></div>","PeriodicalId":48662,"journal":{"name":"Journal of Informetrics","volume":"19 2","pages":"Article 101639"},"PeriodicalIF":3.4,"publicationDate":"2025-01-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143161175","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"管理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Two separated worlds: On the preference of influence in life science and biomedical research","authors":"Zuguang Gu","doi":"10.1016/j.joi.2025.101641","DOIUrl":"10.1016/j.joi.2025.101641","url":null,"abstract":"<div><div>We introduced a new metric, “citation enrichment”, to measure country-to-country influence using citation data. This metric evaluates the degree to which a country prefers to cite another country compared to a random citation process. We applied the citation enrichment method to over 12 million publications in the life science and biomedical fields and we have the following key findings: 1) The global scientific landscape is divided into two separated worlds where developed Western countries exhibit an overall mutual under-influence with the rest of the world; 2) Within each world, countries form clusters based on their mutual citation preferences, with these groupings strongly associated with their geographical and cultural proximity; 3) The two worlds exhibit distinct patterns of the influence balance among countries, revealing underlying mechanisms that drive influence dynamics. We have constructed a comprehensive world map of scientific influence which greatly enhances the deep understanding of the international exchange of scientific knowledge. 
The citation enrichment metric is developed under a well-defined statistical framework and has the potential to be extended into a versatile and powerful tool for bibliometrics and related research fields.</div></div>","PeriodicalId":48662,"journal":{"name":"Journal of Informetrics","volume":"19 2","pages":"Article 101641"},"PeriodicalIF":3.4,"publicationDate":"2025-01-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143161178","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"管理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A comprehensive comparative analysis of publication monopoly phenomenon in scientific journals","authors":"Chengjun Zhang , ZhengJu Ren , Gaofeng Xiang , Wenbin Yu , Zeyu Xu , Jin Liu , Yadang Chen","doi":"10.1016/j.joi.2024.101628","DOIUrl":"10.1016/j.joi.2024.101628","url":null,"abstract":"<div><div>The increasing number of academic practitioners has resulted in a significantly increased volume of scientific papers, attracting considerable interest among researchers examining this correlation. However, little research has been devoted to the phenomenon of scientists monopolizing authorship in academic journals. This study thus introduces the term Publication Monopoly (PM) to describe this effect. The study refers to the prolific authors as Monopoly Authors. In addition, it proposes a Monopoly Index to assess PM severity. For each journal, the Monopoly Contribution (MC) quantifies the impact of Monopoly Authors. Using the Open Academic Graph dataset, our analysis explores the prevalence of PM and the corresponding MC in selected journals and academic fields. The findings demonstrate a positive relationship between the number of articles published and the likelihood of PM occurrence in most journals. Furthermore, fields relying heavily on laboratory environments or specialized equipment are particularly susceptible to PM. Additionally, once a journal becomes entrenched in PM, it is challenging to alleviate this phenomenon over time. Our study of PM aimed to prompt academic practitioners to carefully consider the likelihood of acceptance in journals characterized by high PM levels. 
Moreover, the study encourages journals to reconsider their need to accept more articles from Monopoly Authors.</div></div>","PeriodicalId":48662,"journal":{"name":"Journal of Informetrics","volume":"19 1","pages":"Article 101628"},"PeriodicalIF":3.4,"publicationDate":"2024-12-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142759638","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"管理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Leveraging patent classification based on deep learning: The case study on smart cities and industrial Internet of Things","authors":"Munan Li , Liang Wang","doi":"10.1016/j.joi.2024.101616","DOIUrl":"10.1016/j.joi.2024.101616","url":null,"abstract":"<div><div>With the trends of technology convergence and technology interdisciplinarity, technology-field (TF) resolution and classification of patents have gradually been challenged. Whether for patent applicants or for patent examiners, more precisely labeling the TF for a certain patent is important for technological searches. However, determining the TF of a patent may be difficult and may even involve the strategic behavior of patenting, which can cause noise in patent classification systems (PCSs). In addition, some specific patents could contain more TFs than claimed or be assigned questionable IPC codes; subsequently, in a regular search for technology/patents, information could be missed. Considering the advantages of deep learning compared with traditional machine learning algorithms in areas such as natural language processing (NLP), text classification and text sentiment analysis, this paper investigates several popular deep learning models and proposes a large-scale multilabel regression (MLR) model to handle specific patent analyses under situations of small sample learning. To verify the proposed MLR model for patent classification, the case study on smart cities and industrial Internet of Things (IIoT) is conducted. The MLR experiments on the TF resolution of smart cities and IIoT have yielded moderate results compared with those of the latest patent classification studies, which also rely on deep learning and the large language models (LLMs), which include RCNN, Bi-LSTM, BERT and GPT-4 etc. 
Therefore, the proposed MLR model with a customized loss function could be moderately effective for patent classification within a specific technology theme, could have implications for patent classification and the TF resolution of patents, and could further enrich methodologies for patent mining and informetrics based on artificial intelligence (AI).</div></div>","PeriodicalId":48662,"journal":{"name":"Journal of Informetrics","volume":"19 1","pages":"Article 101616"},"PeriodicalIF":3.4,"publicationDate":"2024-11-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142747971","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"管理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Linkages among science, technology, and industry on the basis of main path analysis","authors":"Shuo Xu , Zhen Liu , Xin An , Hong Wang , Hongshen Pang","doi":"10.1016/j.joi.2024.101617","DOIUrl":"10.1016/j.joi.2024.101617","url":null,"abstract":"<div><div>Compared to the science-technology linkages, the linkages among science, technology, and industry are largely under-studied. Therefore, this paper proposes a main path analysis based framework to discover the science-technology-industry linkages, in which scientific publications, patents, and products are viewed as respective proxies of scientific research, technological advance, and industrial development. To validate the feasibility and effectiveness of our framework, after the DrugBank dataset in pharmaceutical industry was downloaded in XML form on 1 November 2019, this dataset is further enriched, drug entity mentions are recognized from scholarly articles and patents, and several citation cycles are eliminated. The scientific publications span from 1871 to 2019, and patents from 1953 to 2019. There are 8,421, 5,590, and 2,136 article, patent, and drug nodes and 41,200 citations in the largest weakly connected component of the constructed heterogeneous citation network. From empirical analysis on the largest weakly connected component, main conclusions can be drawn as follows. (1) The discovered developmental trajectories indeed encode the interactions among science, technology, and industry. Science and technology not only interact with each other, but also jointly promote the development of the industry, and the industry, in turn, influences the advancement of science and technology. (2) The developmental modes in the pharmaceutical industry can be grouped into three categories: pushed by only science, pushed by only technology, and pushed by science and technology simultaneously. 
(3) The drugs bridge scientific research and technological advance, and thereby help enhance knowledge exchanges between science and technology and shorten the cycle of drug development. This study contributes to discovering the linkages among science, technology, and industry from the perspective of mutual citations among scholarly articles, patents, and products. However, a scientific verification of our framework in other industries apart from pharmaceutical industry still needs to be further investigated.</div></div>","PeriodicalId":48662,"journal":{"name":"Journal of Informetrics","volume":"19 1","pages":"Article 101617"},"PeriodicalIF":3.4,"publicationDate":"2024-11-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142701974","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"管理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Citation counts and inclusion of references in seven free-access scholarly databases: A comparative analysis","authors":"Lorena Delgado-Quirós , José Luis Ortega","doi":"10.1016/j.joi.2024.101618","DOIUrl":"10.1016/j.joi.2024.101618","url":null,"abstract":"<div><div>The aim of this study is to examine disparities in citation counts amongst scholarly databases and the reasons that contribute to these differences. A random Crossref sample of >115k DOIs was selected and subsequently searched across six databases (Dimensions, Google Scholar, Microsoft Academic, Scilit, Semantic Scholar and The Lens). In July 2021, citation counts and lists of references were extracted from each database for comparative processing and analysis. The findings indicate that publications in Crossref-based databases (Crossref, Dimensions, Scilit and The Lens) have similar citation counts, while documents in search engines (Google Scholar, Microsoft Academic and Semantic Scholar) have a higher number of citations due to a greater coverage of publications, but also to the integration of web copies. Analysis of references has revealed that Scilit only extracts references with Digital Object Identifiers (DOI) and that Semantic Scholar causes significant problems when it adds references from external web versions. Ultimately, the study has shown that all the databases struggle to index references from books and book chapters, which may be attributable to certain academic publishers. 
The study concludes with a discussion of the potential effects on research evaluation that may arise from this lack of citations.</div></div>","PeriodicalId":48662,"journal":{"name":"Journal of Informetrics","volume":"19 1","pages":"Article 101618"},"PeriodicalIF":3.4,"publicationDate":"2024-11-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142701972","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"管理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Gender differences in dropout rate: From field, career status, and generation perspectives","authors":"Yunhan Yang , Chenwei Zhang , Huimin Xu , Yi Bu , Meijun Liu , Ying Ding","doi":"10.1016/j.joi.2024.101615","DOIUrl":"10.1016/j.joi.2024.101615","url":null,"abstract":"<div><div>The dropout of scholars poses risks by depleting valuable resources and hindering the scientific community. Existing research on this issue lacks consistency across career statuses and overlooks its dynamic nature. To address this gap, we analyzed the career trajectories of over 24 million scholars in 19 fields from the MAG dataset, examining dropout rates by field, career status, and generation. Firstly, we observed an unexpectedly high proportion of transients, comprising a growing proportion of newcomers and accounting for over 50% of publications in most soft sciences. This highlights the shortage of continuants, such as scholars with full careers, who contribute to scientific communities. Secondly, our exploration into gender-specific dropout rates revealed that women exhibit significantly higher dropout rates within the first 20 years, covering career statuses including junior dropout, early-career dropout, and mid-career dropout. Notably, early- and mid-career dropouts demonstrate the lowest and most stable dropout rates. These insights prompted the development of a gendered scientific career model that combines changes in scholar numbers and dropout rates across career statuses. Lastly, our generational analysis spanning four generations unveiled a diminishing gender gap in dropout rates. In hard sciences, women encounter initial career challenges, with the gender gap in dropout rates decreasing over time. In contrast, the gender gap in soft sciences persists longer. 
These findings hold consistent across six subfields, offering implications for field evaluation, gender disparities policies, and a deeper understanding of scholarly dropout across generations.</div></div>","PeriodicalId":48662,"journal":{"name":"Journal of Informetrics","volume":"19 1","pages":"Article 101615"},"PeriodicalIF":3.4,"publicationDate":"2024-11-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142701969","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"管理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}