Automatic classification of scientific groups as productive: An approach based on motif analysis

2014 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM 2014). Pub Date: 2014-08-17. DOI: 10.1109/ASONAM.2014.6921572

Tanmoy Chakraborty, Niloy Ganguly, Animesh Mukherjee


Citations: 3

Abstract

One of the key aspects instrumental in the advancement of science is “team science,” or in other words, “group” collaboration. Extensive studies have analyzed the statistical properties of collaborations involving individual authors or pairs of authors. However, few studies address groups or teams of scientists working together. In this paper, we study the productivity of group collaborations, where groups are represented as small substructures usually termed network motifs in the literature. A preliminary observation is that star-like motifs have the highest productivity (defined as a function of citation count), followed by 4-cliques. We then introduce a set of features and study how each one relates to the productivity of a team. Building on these observations, we develop a supervised classification model that automatically distinguishes highly productive teams from less productive ones based on the identified features. The classification accuracy is 82% on average across all motifs and reaches as high as 95% for 4-cliques. Finally, we present a detailed analysis of the time-transition behavior of different motifs, along with some real-world highly productive motifs found in our dataset. This empirical study is a first step toward a full-fledged recommendation system that can predict how productive a team will be in the future.
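To make the motif vocabulary concrete, the sketch below labels a four-author group by the shape of its induced collaboration subgraph. It is an illustration only, not the authors' pipeline: the abstract does not say how motifs were extracted, so the function name `classify_four_node_motif`, the edge-list input format, and every label other than "star" and "4-clique" are assumptions. It relies on the fact that the six connected four-node graphs have distinct degree sequences.

```python
# Hypothetical helper: the abstract does not describe the authors' extraction
# method. This only illustrates what "star" and "4-clique" mean as shapes.

# Sorted degree sequence -> label for each connected 4-node graph.
# Only "star" and "4-clique" are terms from the abstract; the other
# labels are standard graph-theory names added for completeness.
MOTIF_BY_DEGREES = {
    (1, 1, 1, 3): "star",
    (1, 1, 2, 2): "path",
    (2, 2, 2, 2): "cycle",
    (1, 2, 2, 3): "tailed triangle",
    (2, 2, 3, 3): "diamond",
    (3, 3, 3, 3): "4-clique",
}


def classify_four_node_motif(nodes, edges):
    """Label the subgraph induced on four nodes by its degree sequence."""
    node_set = set(nodes)
    if len(node_set) != 4:
        raise ValueError("expected exactly four distinct nodes")

    degree = {node: 0 for node in node_set}
    seen = set()
    for u, v in edges:
        # Ignore self-loops and edges that leave the four-node group.
        if u == v or u not in node_set or v not in node_set:
            continue
        pair = frozenset((u, v))
        if pair in seen:  # treat repeated co-authorships as a single edge
            continue
        seen.add(pair)
        degree[u] += 1
        degree[v] += 1

    signature = tuple(sorted(degree.values()))
    # Any degree sequence not listed above belongs to a disconnected graph.
    return MOTIF_BY_DEGREES.get(signature, "disconnected")


if __name__ == "__main__":
    authors = ["a", "b", "c", "d"]
    hub_edges = [("a", "b"), ("a", "c"), ("a", "d")]
    print(classify_four_node_motif(authors, hub_edges))
```

In a full study, labels like these would be paired with per-group features and a citation-based productivity score before training a classifier; neither the features nor the scoring function is specified in the abstract, so they are left out here.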

