Analysis of the evolution of scientific collaboration networks for the prediction of new co-authorships

IF 0.5 · Zone 4 (Management) · Q3 · INFORMATION SCIENCE & LIBRARY SCIENCE
Felipe M. Affonso, Monique de Oliveira Santiago, Thiago Magela Rodrigues Dias
DOI: 10.1590/2318-0889202234e200033 (https://doi.org/10.1590/2318-0889202234e200033)
Journal: Transinformacao
Publication date: 2022-01-01
Publication type: Journal Article
Citation count: 0

Abstract

When authors publish an article together, their collaboration creates links that, taken together, form a scientific collaboration network. In this network, the authors are represented by the nodes and the papers by the edges. This raises the question: how does the network evolve over time? Answering it requires understanding which factors are essential to the creation of a new connection. The purpose of this article is therefore to predict connections in co-authorship networks formed by PhDs with curricula registered on the Lattes Platform in the areas of Information Sciences and Biology. The Lattes Platform holds 6.6 million researcher résumés and is one of the most relevant and recognized scientific repositories worldwide. The following steps are performed. First, the data are extracted and organized, a step that is essential for the rest of the process. Then, co-authorship networks are generated from jointly published articles. Subsequently, the attributes to be used are defined and some metrics are calculated. Finally, machine learning algorithms estimate future scientific collaborations in the selected areas. Random forest and logistic regression algorithms showed the highest hit rates, and the preferential attachment attribute was identified as the most influential in the emergence of new scientific collaborations. These results make it possible to establish the evolution of the network of scientific associations among researchers at a national level, assisting development agencies in selecting future outstanding researchers.
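The abstract does not give implementation details, so the following is only a minimal sketch of the steps it describes, under stated assumptions: a co-authorship network is built from author lists, the preferential attachment score of a pair is taken to be the product of the two authors' degrees (the usual definition in link prediction, assumed here), and each pair that has not yet published together is labeled by whether a link appears in a later period. The function names and the toy author lists are hypothetical and are not taken from the Lattes Platform data.

```python
from collections import defaultdict
from itertools import combinations


def build_coauthorship_network(papers):
    """Map each author to the set of co-authors (authors are nodes, shared papers are edges)."""
    neighbors = defaultdict(set)
    for authors in papers:
        for a, b in combinations(sorted(set(authors)), 2):
            neighbors[a].add(b)
            neighbors[b].add(a)
    return neighbors


def preferential_attachment(neighbors, a, b):
    """Product of the two authors' degrees (number of distinct co-authors)."""
    return len(neighbors.get(a, ())) * len(neighbors.get(b, ()))


def candidate_pairs(neighbors):
    """Yield pairs of authors in the network who have not yet published together."""
    for a, b in combinations(sorted(neighbors), 2):
        if b not in neighbors[a]:
            yield a, b


# Hypothetical toy data: each inner list is the author list of one paper.
past_papers = [
    ["Ana", "Bruno"],
    ["Ana", "Carla"],
    ["Ana", "Davi"],
    ["Bruno", "Carla"],
    ["Davi", "Elisa"],
]
future_papers = [["Bruno", "Davi"]]

past = build_coauthorship_network(past_papers)
future = build_coauthorship_network(future_papers)

# One row per unlinked pair: (pair, feature value, label).
# The label is 1 if the pair co-authors a paper in the later period, else 0.
rows = [
    ((a, b), preferential_attachment(past, a, b), int(b in future.get(a, ())))
    for a, b in candidate_pairs(past)
]

for pair, score, label in rows:
    print(pair, score, label)
```

Rows of this shape (one or more attributes per candidate pair plus a 0/1 label) are the kind of input a classifier such as random forest or logistic regression could be trained on; which attributes and settings the authors actually used is not stated in the abstract.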
Source journal: Transinformacao
Category: INFORMATION SCIENCE & LIBRARY SCIENCE
CiteScore: 0.80
Self-citation rate: 16.70%
Publication volume: 16
Review time: 36 weeks
Journal description: Transinformação is a specialized journal published every four months, open to contributions from the national and international scientific community and published by the School of Library Science and the Center for Applied Human and Social Sciences of the Pontifical Catholic University of Campinas. Founded in 1989, it is rated A1 in the Qualis list and publishes articles that contribute to the study and scientific development of Information Science, Library Science, Archival Science, Museology, and related fields.