Quality issues in co-authorship data of a national scientific community

Domenico De Stefano, V. Fuccella, M. P. Vitale, S. Zaccarin
Network Science, journal article, published 2023-01-20
Impact factor: 1.4, JCR Q2 (Social Sciences, Interdisciplinary)
DOI: 10.1017/nws.2022.40 (https://doi.org/10.1017/nws.2022.40)
Citations: 1

Abstract

A stream of research on co-authorship, used as a proxy for scholars’ collaborative behavior, focuses on members of a scientific community defined on a disciplinary and/or national basis, for which co-authorship data have to be retrieved. Recent literature has pointed out that international digital libraries provide only partial coverage of scholars’ scientific production and under-cover the scholars in the community. Bias in retrieving the co-authorship data of the community of interest can affect network construction and network measures in several ways, giving a partial picture of the real collaboration among scholars in writing papers. In this contribution, we collected bibliographic records of Italian academic statisticians from an online platform (IRIS) available at most universities. Although the platform guarantees a high coverage rate of our population and its scientific production, some data quality issues have to be dealt with. Thus, a web scraping procedure based on a semi-automatic tool to retrieve publication metadata, together with data management tools to detect duplicate records and to reconcile authors, is proposed. The results show that collaboration is an active and increasing practice for Italian academic statisticians, with some differences according to the gender, academic rank, and university location of scholars. The heuristic procedure used to address data quality issues in the IRIS platform can serve as a working case report to be adapted to other bibliographic archives with similar characteristics.
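The abstract mentions detecting duplicate records and reconciling authors before building the co-authorship network. The sketch below illustrates one simple way such a step could look. It is not the authors' tool: the record fields (`title`, `year`, `authors`), the matching rule (normalized title plus year), and the author key (surname plus first initial) are all assumptions made for illustration, and the sample records are invented.

```python
import re
import unicodedata
from collections import Counter
from itertools import combinations


def strip_accents(text: str) -> str:
    """Remove diacritics so 'Società' and 'Societa' compare equal."""
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def normalize_title(title: str) -> str:
    """Lowercase, drop accents and punctuation, collapse whitespace."""
    text = strip_accents(title).lower()
    text = re.sub(r"[^a-z0-9 ]+", " ", text)
    return " ".join(text.split())


def author_key(name: str) -> str:
    """Hypothetical reconciliation rule: 'surname' + first initial.

    Handles 'Surname, Given' and 'Given Surname'. Compound surnames
    written without a comma (e.g. 'De Stefano') are NOT handled and
    would need manual review or a lookup table.
    """
    text = strip_accents(name).lower().replace(".", " ")
    if "," in text:
        surname, given = (part.strip() for part in text.split(",", 1))
    else:
        parts = text.split()
        if not parts:
            return ""
        surname, given = parts[-1], " ".join(parts[:-1])
    initial = given[0] if given else ""
    return f"{surname} {initial}".strip()


def deduplicate(records):
    """Merge records sharing a normalized title and year (assumed rule)."""
    merged = {}
    for rec in records:
        key = (normalize_title(rec["title"]), rec["year"])
        merged.setdefault(key, set()).update(
            author_key(a) for a in rec["authors"]
        )
    return merged


def coauthor_edges(merged):
    """Count joint papers for every unordered pair of reconciled authors."""
    edges = Counter()
    for authors in merged.values():
        for a, b in combinations(sorted(authors), 2):
            edges[(a, b)] += 1
    return edges


# Invented sample: the first two records describe the same paper.
records = [
    {"title": "Quality Issues in Co-authorship Data", "year": 2023,
     "authors": ["Rossi, Anna", "Bianchi, Luca"]},
    {"title": "Quality issues in co-authorship data.", "year": 2023,
     "authors": ["A. Rossi", "L. Bianchi"]},
    {"title": "Another paper", "year": 2021,
     "authors": ["Anna Rossi", "Verdi, Carla", "Luca Bianchi"]},
]

papers = deduplicate(records)
edges = coauthor_edges(papers)
```

Without the deduplication step, the first paper would be counted twice and the tie between the two authors inflated, which is the kind of distortion in network measures the abstract warns about. A key based on surname and initial can also merge distinct people, so in practice ambiguous matches would need manual checking.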
Source journal
Network Science (Social Sciences, Interdisciplinary)
CiteScore: 3.50
Self-citation rate: 5.90%
Articles published: 24
About the journal: Network Science is an important journal for an important discipline, one that uses the network paradigm, focusing on actors and relational linkages, to inform research, methodology, and applications from many fields across the natural, social, engineering and informational sciences. Given the growing understanding of the interconnectedness and globalization of the world, network methods are an increasingly recognized way to research aspects of modern society along with the individuals, organizations, and other actors within it. The discipline is ready for a comprehensive journal, open to papers from all relevant areas. Network Science is a defining work, shaping this discipline. The journal welcomes contributions from researchers in all areas working on network theory, methods, and data.