{"title":"How reliable are unsupervised author disambiguation algorithms in the assessment of research organization performance?","authors":"G. Abramo, Ciriaco Andrea D’Angelo","doi":"10.1162/qss_a_00236","DOIUrl":"https://doi.org/10.1162/qss_a_00236","url":null,"abstract":"Abstract Assessing the performance of universities by output to input indicators requires knowledge of the individual researchers working within them. Although in Italy the Ministry of University and Research updates a database of university professors, in all those countries where such databases are not available, measuring research performance is a formidable task. One possibility is to trace the research personnel of institutions indirectly through their publications, using bibliographic repertories together with author names disambiguation algorithms. This work evaluates the goodness-of-fit of the Caron and van Eck, CvE unsupervised algorithm by comparing the research performance of Italian universities resulting from its application for the derivation of the universities’ research staff, with that resulting from the supervised algorithm of D’Angelo, Giuffrida, and Abramo (2011), which avails of input data. Results show that the CvE algorithm overestimates the size of the research staff of organizations by 56%. Nonetheless, the performance scores and ranks recorded in the two compared modes show a significant and high correlation. Still, nine out of 69 universities show rank deviations of two quartiles. 
Measuring the extent of distortions inherent in any evaluation exercise using unsupervised algorithms can inform policymakers’ decisions on building national research staff databases, instead of settling for the unsupervised approaches.","PeriodicalId":34021,"journal":{"name":"Quantitative Science Studies","volume":"4 1","pages":"144-166"},"PeriodicalIF":6.4,"publicationDate":"2022-09-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"45627516","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A converging global research system","authors":"Jonathan Adams, M. Szomszor","doi":"10.1162/qss_a_00208","DOIUrl":"https://doi.org/10.1162/qss_a_00208","url":null,"abstract":"Abstract We examine the hypothesis that research collaboration has enabled a global research network to evolve, with self-organizing properties transcending national research policy. We examine research output, bilateral and multilateral collaboration, subject diversity, and citation impact over 40 years, in detail for the G7 and BRICK groups of countries and in summary for 26 other nations. We find that the rise in national output was strongly associated with bilateral collaboration until the 2000s but after that by multilateral partnerships, with the shift happening at much the same time across countries. There was a general increase in research subject diversity, with evenness across subjects converging on a similar index value for many countries. Similar diversity is not the same as actual similarity but, in fact, the G7 countries became increasingly similar. National average citation impact (CNCI) rose and groups converged on similar impact values. The impact of the largest economies is above world average, which is a phenomenon we discuss separately. The similarities in patterns and timing occur across countries despite variance in their research policies, such as research assessment. 
We suggest that the key agent facilitating global network self-organization is a shared concept of best practice in research.","PeriodicalId":34021,"journal":{"name":"Quantitative Science Studies","volume":"3 1","pages":"715-731"},"PeriodicalIF":6.4,"publicationDate":"2022-09-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"44132027","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Recategorising research: Mapping from FoR 2008 to FoR 2020 in Dimensions","authors":"Simon Porter, Daniel W. Hook","doi":"10.1162/qss_a_00244","DOIUrl":"https://doi.org/10.1162/qss_a_00244","url":null,"abstract":"Abstract In 2020 the Australia New Zealand Standard Research Classification Fields of Research Codes (ANZSRC FoR codes) were updated by their owners. This has led the sector to need to update their systems of reference and has caused suppliers working in the research information sphere to need to update both systems and data. This paper focuses on the approach developed by Digital Science’s Dimensions team to the creation of an improved machine-learning training set, and the mapping of that set from FoR 2008 codes to FoR 2020 codes so that the Dimensions classification approach for the ANZSRC codes could be improved and updated.","PeriodicalId":34021,"journal":{"name":"Quantitative Science Studies","volume":"4 1","pages":"127-143"},"PeriodicalIF":6.4,"publicationDate":"2022-08-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"46767203","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Peer reviewer topic choice and its impact on interrater reliability: A mixed-method study","authors":"Thomas Feliciani, Junwen Luo, K. Shankar","doi":"10.1162/qss_a_00207","DOIUrl":"https://doi.org/10.1162/qss_a_00207","url":null,"abstract":"Abstract One of the main critiques of academic peer review is that interrater reliability (IRR) among reviewers is low. We examine an underinvestigated factor possibly contributing to low IRR: reviewers’ diversity in their topic-criteria mapping (“TC-mapping”). It refers to differences among reviewers pertaining to which topics they choose to emphasize in their evaluations, and how they map those topics onto various evaluation criteria. In this paper we look at the review process of grant proposals in one funding agency to ask: How much do reviewers differ in TC-mapping, and do their differences contribute to low IRR? Through a content analysis of review forms submitted to a national funding agency (Science Foundation Ireland) and a survey of its reviewers, we find evidence of interreviewer differences in their TC-mapping. Using a simulation experiment we show that, under a wide range of conditions, even strong differences in TC-mapping have only a negligible impact on IRR. 
Although further empirical work is needed to corroborate simulation results, these tentatively suggest that reviewers’ heterogeneous TC-mappings might not be of concern for designers of peer review panels to safeguard IRR.","PeriodicalId":34021,"journal":{"name":"Quantitative Science Studies","volume":"3 1","pages":"832-856"},"PeriodicalIF":6.4,"publicationDate":"2022-08-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"45723631","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"An open data set of scholars on Twitter","authors":"P. Mongeon, T. Bowman, R. Costas","doi":"10.1162/qss_a_00250","DOIUrl":"https://doi.org/10.1162/qss_a_00250","url":null,"abstract":"Abstract The role played by research scholars in the dissemination of scientific knowledge on social media has always been a central topic in social media metrics (altmetrics) research. Different approaches have been implemented to identify and characterize active scholars on social media platforms like Twitter. Some limitations of past approaches were their complexity and, most importantly, their reliance on licensed scientometric and altmetric data. The emergence of new open data sources such as OpenAlex or Crossref Event Data provides opportunities to identify scholars on social media using only open data. This paper presents a novel and simple approach to match authors from OpenAlex with Twitter users identified in Crossref Event Data. The matching procedure is described and validated with ORCID data. The new approach matches nearly 500,000 scholars with their Twitter accounts with high precision and moderate recall. The data set of matched scholars is described and made openly available to the scientific community to empower more advanced studies of the interactions of research scholars on Twitter.","PeriodicalId":34021,"journal":{"name":"Quantitative Science Studies","volume":"4 1","pages":"314-324"},"PeriodicalIF":6.4,"publicationDate":"2022-08-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"48774891","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Visualizing academic descendants using modified Pavlo diagrams: Results based on five researchers in biomechanics and biomedicine","authors":"W. Lievers","doi":"10.1162/qss_a_00205","DOIUrl":"https://doi.org/10.1162/qss_a_00205","url":null,"abstract":"Abstract Visualizing the academic descendants of prolific researchers is a challenging problem. To this end, a modified Pavlo algorithm is presented and its utility is demonstrated based on manually collected academic genealogies of five researchers in biomechanics and biomedicine. The researchers have 15–32 children each and between 93 and 384 total descendants. The graphs generated by the modified algorithm were over 97% smaller than the original. Mentorship metrics were also calculated; their hm-indices are 5–7 and the gm-indices are in the range 7–13. Of the 1,096 unique researchers across the five family trees, 153 (14%) had graduated their own PhD students by the end of 2021. It took an average of 9.6 years after their own graduation for an advisor to graduate their first PhD student, which suggests that an academic generation in this field is approximately one decade. The manually collected data sets used were also compared against the crowd-sourced academic genealogy data from the AcademicTree.org website. The latter included only 45% of the people and 34% of the connections, so this limitation must be considered when using it for analyses where completeness is required. 
The data sets and an implementation of the algorithm are available for reuse.","PeriodicalId":34021,"journal":{"name":"Quantitative Science Studies","volume":"3 1","pages":"489-511"},"PeriodicalIF":6.4,"publicationDate":"2022-08-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"64426870","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Measuring and interpreting the differences of the nations’ scientific specialization indexes by output and by input","authors":"G. Abramo, Ciriaco Andrea D’Angelo, F. D. Costa","doi":"10.1162/qss_a_00206","DOIUrl":"https://doi.org/10.1162/qss_a_00206","url":null,"abstract":"Abstract This paper compares the national scientific profiles of 199 countries in 254 fields, tracked by two indices of scientific specialization based respectively on indicators of input and output. For each country, the indicator of inputs considers the number of researchers in each field. The output indicator, named Total Fractional Impact, based on the citations of publications indexed in the Web of Science, measures the scholarly impact of knowledge produced in each field. For each country, the approach allows us to measure the deviations between the two profiles, thereby revealing potential differences in research efficiency and/or capital allocation across fields, compared to benchmark countries.","PeriodicalId":34021,"journal":{"name":"Quantitative Science Studies","volume":"3 1","pages":"755-775"},"PeriodicalIF":6.4,"publicationDate":"2022-08-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"47923680","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Which factors are associated with Open Access publishing? A Springer Nature case study","authors":"Fakhri Momeni, S. Dietze, Philipp Mayr, Kristin Biesenbender, Isabella Peters","doi":"10.1162/qss_a_00253","DOIUrl":"https://doi.org/10.1162/qss_a_00253","url":null,"abstract":"Abstract Open Access (OA) facilitates access to research articles. However, authors or funders often must pay the publishing costs, preventing authors who do not receive financial support from participating in OA publishing and gaining citation advantage for OA articles. OA may exacerbate existing inequalities in the publication system rather than overcome them. To investigate this, we studied 522,411 articles published by Springer Nature. Employing correlation and regression analyses, we describe the relationship between authors affiliated with countries from different income levels, their choice of publishing model, and the citation impact of their papers. A machine learning classification method helped us to explore the importance of different features in predicting the publishing model. The results show that authors eligible for article processing charge (APC) waivers publish more in gold OA journals than others. In contrast, authors eligible for an APC discount have the lowest ratio of OA publications, leading to the assumption that this discount insufficiently motivates authors to publish in gold OA journals. We found a strong correlation between the journal rank and the publishing model in gold OA journals, whereas the OA option is mostly avoided in hybrid journals. 
Also, results show that the countries’ income level, seniority, and experience with OA publications are the most predictive factors for OA publishing in hybrid journals.","PeriodicalId":34021,"journal":{"name":"Quantitative Science Studies","volume":"4 1","pages":"353-371"},"PeriodicalIF":6.4,"publicationDate":"2022-08-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"48857752","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"An improved practical approach to forecasting exceptional growth in research","authors":"K. Boyack, R. Klavans","doi":"10.1162/qss_a_00202","DOIUrl":"https://doi.org/10.1162/qss_a_00202","url":null,"abstract":"Abstract The accurate forecasting of exceptional growth in research areas has been an extremely difficult problem to solve. In a previous study we introduced an approach to forecasting which research clusters in a global model of the scientific literature would have a growth rate of 8% annually over a 3-year period. In this study we (a) introduce a much more robust method of creating and updating global models of research, (b) introduce new indicators based on author publication patterns, (c) test a much larger set (81) of indicators to forecast exceptional growth, and (d) expand the forecast horizon from 3 to 4 years. Forecast accuracy increased dramatically (threat score increased from 20 to 32) from our previous study. Most of this gain is surprisingly due to the advances in model robustness rather than the indicators used for forecasting. We also provide evidence that most indicators (including popular network indicators) do not improve the ability to forecast growth in research above the baseline provided by indicators associated with the vitality of a research cluster.","PeriodicalId":34021,"journal":{"name":"Quantitative Science Studies","volume":"3 1","pages":"672-693"},"PeriodicalIF":6.4,"publicationDate":"2022-08-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"41576552","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"AOC: Assembling overlapping communities","authors":"Akhil Jakatdar, T. Warnow, George Chacko","doi":"10.1162/qss_a_00227","DOIUrl":"https://doi.org/10.1162/qss_a_00227","url":null,"abstract":"Abstract Through discovery of mesoscale structures, community detection methods contribute to the understanding of complex networks. Many community finding methods, however, rely on disjoint clustering techniques, in which node membership is restricted to one community or cluster. This strict requirement limits the ability to inclusively describe communities because some nodes may reasonably be assigned to multiple communities. We have previously reported Iterative K-core Clustering, a scalable and modular pipeline that discovers disjoint research communities from the scientific literature. We now present Assembling Overlapping Clusters (AOC), a complementary metamethod for overlapping communities, as an option that addresses the disjoint clustering problem. We present findings from the use of AOC on a network of over 13 million nodes that captures recent research in the very rapidly growing field of extracellular vesicles in biology.","PeriodicalId":34021,"journal":{"name":"Quantitative Science Studies","volume":"3 1","pages":"1079-1096"},"PeriodicalIF":6.4,"publicationDate":"2022-08-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"45319612","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}