{"title":"Towards a moralization of bibliometrics? A response to Kyle Siler","authors":"Y. Gingras","doi":"10.1162/qss_c_00178","DOIUrl":"https://doi.org/10.1162/qss_c_00178","url":null,"abstract":"Abstract In a recent letter to QSS, Kyle Siler (2021), made harsh comments against the decision of the editors to publish a controversial paper signed by Alessandro Strumia (2021) about gender differences in high-energy physics. My aim here is to point to the elements in Siler’s letter that are typical of a new tendency to replace rational and technical arguments with a series of moral statements and ex cathedra affirmations that are not supported by cogent arguments. Such an approach can only be detrimental to rational debates within the bibliometric research community.","PeriodicalId":34021,"journal":{"name":"Quantitative Science Studies","volume":"3 1","pages":"315-318"},"PeriodicalIF":6.4,"publicationDate":"2022-04-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"42164765","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"The management of scientific and technological infrastructures: The case of the Mexican National Laboratories","authors":"Leonardo Munguía, E. Robles-Belmont, J. Escalante","doi":"10.1162/qss_a_00230","DOIUrl":"https://doi.org/10.1162/qss_a_00230","url":null,"abstract":"Abstract The effectiveness of research units is assessed on the basis of their performance in relation to scientific, technological, and innovation production, the quality of their results, and their contribution to the solution of scientific and social problems. We examine the management practices employed in some Mexican National Laboratories to identify those practices that could explain their effectiveness in meeting their objectives. The results of other research that propose common elements among laboratories with outstanding performance are used and verified directly in the field. Considering the inherent complexity of each field of knowledge and the sociospatial characteristics in which the laboratories operate, we report which management practices are relevant for their effectiveness, how they contribute to their consolidation as fundamental scientific and technological infrastructures, and how these can be translated into indicators that support the evaluation of their performance.","PeriodicalId":34021,"journal":{"name":"Quantitative Science Studies","volume":"4 1","pages":"246-261"},"PeriodicalIF":6.4,"publicationDate":"2022-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"42055891","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Funding COVID-19 research: Insights from an exploratory analysis using open data infrastructures","authors":"Alexis-Michel Mugabushaka, Nees Jan van Eck, L. Waltman","doi":"10.1162/qss_a_00212","DOIUrl":"https://doi.org/10.1162/qss_a_00212","url":null,"abstract":"Abstract To analyze the outcomes of the funding they provide, it is essential for funding agencies to be able to trace the publications resulting from their funding. We study the open availability of funding data in Crossref, focusing on funding data for publications that report research related to COVID-19. We also present a comparison with the funding data available in two proprietary bibliometric databases: Scopus and Web of Science. Our analysis reveals limited coverage of funding data in Crossref. It also shows problems related to the quality of funding data, especially in Scopus. We offer recommendations for improving the open availability of funding data in Crossref.","PeriodicalId":34021,"journal":{"name":"Quantitative Science Studies","volume":"3 1","pages":"560-582"},"PeriodicalIF":6.4,"publicationDate":"2022-02-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"45834216","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Center–periphery structure in research communities","authors":"E. Wedell, Minhyuk Park, Dmitriy Korobskiy, T. Warnow, George Chacko","doi":"10.1162/qss_a_00184","DOIUrl":"https://doi.org/10.1162/qss_a_00184","url":null,"abstract":"Abstract Clustering and community detection in networks are of broad interest and have been the subject of extensive research that spans several fields. We are interested in the relatively narrow question of detecting communities of scientific publications that are linked by citations. These publication communities can be used to identify scientists with shared interests who form communities of researchers. Building on the well-known k-core algorithm, we have developed a modular pipeline to find publication communities with center–periphery structure. Using a quantitative and qualitative approach, we evaluate community finding results on a citation network consisting of over 14 million publications relevant to the field of extracellular vesicles. We compare our approach to communities discovered by the widely used Leiden algorithm for community finding.","PeriodicalId":34021,"journal":{"name":"Quantitative Science Studies","volume":"214 1","pages":"289-314"},"PeriodicalIF":6.4,"publicationDate":"2022-02-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"73980597","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"The Microsoft Academic Knowledge Graph enhanced: Author name disambiguation, publication classification, and embeddings","authors":"Michael Färber, Lin Ao","doi":"10.1162/qss_a_00183","DOIUrl":"https://doi.org/10.1162/qss_a_00183","url":null,"abstract":"Abstract Although several large knowledge graphs have been proposed in the scholarly field, such graphs are limited with respect to several data quality dimensions such as accuracy and coverage. In this article, we present methods for enhancing the Microsoft Academic Knowledge Graph (MAKG), a recently published large-scale knowledge graph containing metadata about scientific publications and associated authors, venues, and affiliations. Based on a qualitative analysis of the MAKG, we address three aspects. First, we adopt and evaluate unsupervised approaches for large-scale author name disambiguation. Second, we develop and evaluate methods for tagging publications by their discipline and by keywords, facilitating enhanced search and recommendation of publications and associated entities. Third, we compute and evaluate embeddings for all 239 million publications, 243 million authors, 49,000 journals, and 16,000 conference entities in the MAKG based on several state-of-the-art embedding techniques. Finally, we provide statistics for the updated MAKG. Our final MAKG is publicly available at https://makg.org and can be used for the search or recommendation of scholarly entities, as well as enhanced scientific impact quantification.","PeriodicalId":34021,"journal":{"name":"Quantitative Science Studies","volume":"3 1","pages":"51-98"},"PeriodicalIF":6.4,"publicationDate":"2022-02-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"47031666","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Can the quality of published academic journal articles be assessed with machine learning?","authors":"M. Thelwall","doi":"10.1162/qss_a_00185","DOIUrl":"https://doi.org/10.1162/qss_a_00185","url":null,"abstract":"Abstract Formal assessments of the quality of the research produced by departments and universities are now conducted by many countries to monitor achievements and allocate performance-related funding. These evaluations are hugely time consuming if conducted by postpublication peer review and are simplistic if based on citations or journal impact factors. I investigate whether machine learning could help reduce the burden of peer review by using citations and metadata to learn how to score articles from a sample assessed by peer review. An experiment is used to underpin the discussion, attempting to predict journal citation thirds, as a proxy for article quality scores, for all Scopus narrow fields from 2014 to 2020. The results show that these proxy quality thirds can be predicted with above baseline accuracy in all 326 narrow fields, with Gradient Boosting Classifier, Random Forest Classifier, or Multinomial Naïve Bayes being the most accurate in nearly all cases. Nevertheless, the results partly leverage journal writing styles and topics, which are unwanted for some practical applications and cause substantial shifts in average scores between countries and between institutions within a country. There may be scope for predicting articles’ scores when the predictions have the highest probability.","PeriodicalId":34021,"journal":{"name":"Quantitative Science Studies","volume":"3 1","pages":"208-226"},"PeriodicalIF":6.4,"publicationDate":"2022-02-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"49489149","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"See further upon the giants: Quantifying intellectual lineage in science","authors":"Woo Seong Jo, Lu Liu, Dashun Wang","doi":"10.1162/qss_a_00186","DOIUrl":"https://doi.org/10.1162/qss_a_00186","url":null,"abstract":"Abstract Newton’s centuries-old wisdom of standing on the shoulders of giants raises a crucial yet underexplored question: Out of all the prior works cited by a discovery, which one is its giant? Here, we develop a discipline-independent method to identify the giant for any individual paper, allowing us to better understand the role and characteristics of giants in science. We find that across disciplines, about 95% of papers appear to stand on the shoulders of giants, yet the weight of scientific progress rests on relatively few shoulders. Defining a new measure of giant index, we find that, while papers with high citations are more likely to be giants, for papers with the same citations, their giant index sharply predicts a paper’s future impact and prize-winning probabilities. Giants tend to originate from both small and large teams, being either highly disruptive or highly developmental. Papers that did not have a giant tend to do poorly on average, yet interestingly, if such papers later became a giant for other papers, they tend to be home-run papers that are highly disruptive to science. Given the crucial importance of citation-based measures in science, the developed concept of giants may offer a useful dimension in assessing scientific impact that goes beyond sheer citation counts.","PeriodicalId":34021,"journal":{"name":"Quantitative Science Studies","volume":"3 1","pages":"319-330"},"PeriodicalIF":6.4,"publicationDate":"2022-02-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"45165060","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"German cities with universities: Socioeconomic position and university performance","authors":"A. V. van Raan","doi":"10.1162/qss_a_00182","DOIUrl":"https://doi.org/10.1162/qss_a_00182","url":null,"abstract":"Abstract A much-debated topic is the role of universities in the prosperity of cities and regions. Two major problems arise. First, what is a reliable measurement of prosperity? And second, what are the characteristics, particularly research performance, of a university that matter? I focus on this research question: Is there a significant relation between having a university and a city’s socioeconomic strength? And if so, what are the determining indicators of a university; for instance, how important is scientific collaboration? What is the role of scientific quality measured by citation impact? Does the size of a university, measured in number of publications or in number of students matter? I compiled a database of city and university data: gross urban product and population data of nearly 200 German cities and 400 districts. University data are derived from the Leiden Ranking 2020 and supplemented with data on the number of students. The socioeconomic strength of a city is determined using the urban scaling methodology. My study shows a significant relation between the presence of a university in a city and its socioeconomic indicators, particularly for larger cities, and that this is especially the case for universities with higher values of their output, impact and collaboration indicators.","PeriodicalId":34021,"journal":{"name":"Quantitative Science Studies","volume":"3 1","pages":"265-288"},"PeriodicalIF":6.4,"publicationDate":"2022-02-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"43520853","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Continued use of retracted papers: Temporal trends in citations and (lack of) awareness of retractions shown in citation contexts in biomedicine.","authors":"Tzu-Kun Hsiao, Jodi Schneider","doi":"10.1162/qss_a_00155","DOIUrl":"10.1162/qss_a_00155","url":null,"abstract":"<p><p>We present the first database-wide study on the citation contexts of retracted papers, which covers 7,813 retracted papers indexed in PubMed, 169,434 citations collected from iCite, and 48,134 citation contexts identified from the XML version of the PubMed Central Open Access Subset. Compared with previous citation studies that focused on comparing citation counts using two time frames (i.e., preretraction and postretraction), our analyses show the longitudinal trends of citations to retracted papers in the past 60 years (1960-2020). Our temporal analyses show that retracted papers continued to be cited, but that old retracted papers stopped being cited as time progressed. Analysis of the text progression of pre- and postretraction citation contexts shows that retraction did not change the way the retracted papers were cited. Furthermore, among the 13,252 postretraction citation contexts, only 722 (5.4%) citation contexts acknowledged the retraction. In these 722 citation contexts, the retracted papers were most commonly cited as related work or as an example of problematic science. Our findings deepen the understanding of why retraction does not stop citation and demonstrate that the vast majority of postretraction citations in biomedicine do not document the retraction.</p>","PeriodicalId":34021,"journal":{"name":"Quantitative Science Studies","volume":"2 4","pages":"1144-1169"},"PeriodicalIF":4.1,"publicationDate":"2022-02-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9520488/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"40391371","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Identifying scientific publications countrywide and measuring their open access: The case of the French Open Science Barometer (BSO)","authors":"Lauranne Chaignon, D. Egret","doi":"10.1162/qss_a_00179","DOIUrl":"https://doi.org/10.1162/qss_a_00179","url":null,"abstract":"Abstract We use several sources to collect and evaluate academic scientific publication on a country-wide scale, and we apply it to the case of France for the years 2015–2020, while presenting a more detailed analysis focused on the reference year 2019. These sources are diverse: databases available by subscription (Scopus, Web of Science) or open to the scientific community (Microsoft Academic Graph), the national open archive HAL, and databases serving thematic communities (ADS and PubMed). We show the contribution of the different sources to the final corpus. These results are then compared to those obtained with another approach, that of the French Open Science Barometer for monitoring open access at the national level. We show that both approaches provide a convergent estimate of the open access rate. We also present and discuss the definitions of the concepts used, and list the main difficulties encountered in processing the data. The results of this study contribute to a better understanding of the respective contributions of the main databases and their complementarity in the broad framework of a countrywide corpus. They also shed light on the calculation of open access rates and thus contribute to a better understanding of current developments in the field of open science.","PeriodicalId":34021,"journal":{"name":"Quantitative Science Studies","volume":"3 1","pages":"18-36"},"PeriodicalIF":6.4,"publicationDate":"2022-01-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"46750125","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}