Using Explainable AI to Understand Team Formation and Team Impact

Q3 Social Sciences

Proceedings of the Association for Information Science and Technology Pub Date : 2023-10-01 DOI:10.1002/pra2.804

Huimin Xu, Maytal Saar‐Tsechansky, Min Song, Ying Ding


Citations: 0

Abstract

Citation counts are widely treated as a simple and direct indicator of a scientific paper's impact. This paper predicts papers' citations from two groups of team-related variables: team composition and team structure. Team composition covers team size, male/female dominance, academia/industry collaboration, the number of distinct races, and the number of distinct countries. Team structure consists of team power level and team power hierarchy. Team members' previous citation counts, H-index, number of previous collaborators, career age, and number of previous papers serve as proxies for team power. For each proxy, we calculated the mean to represent team power level (the collective capability of the team) and the Gini coefficient to represent team power hierarchy (the vertical difference in how power is distributed within the team). Using 1,675,035 computer science (CS) teams from the DBLP dataset, we trained an XGBoost model to predict whether a paper falls into the high or low citation category. The model reached an AUC of 0.71 and an accuracy of 70.45%. Using the explainable AI method SHAP to evaluate the relative importance of features in predicting team citation categories, we found that team structure plays a more critical role than team composition. A high team power level, a flat team power structure, diverse racial backgrounds, large team size, collaboration with industry, and male-dominated teams are associated with higher predicted team citations. Our project can provide insights into how to form effective scientific teams and maximize team impact through team composition and team structure.
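The two team-structure measures described in the abstract can be illustrated with a minimal sketch. The abstract does not state which Gini formula the authors used, so the code below assumes the standard sorted-rank form of the Gini coefficient; the function names `gini` and `team_power_features` and the example H-index values are hypothetical and chosen only for illustration.

```python
from statistics import mean


def gini(values):
    """Gini coefficient of non-negative values (assumed standard definition).

    Returns 0.0 when all members are equal (a flat team) and approaches 1.0
    as power concentrates in a single member (a steep hierarchy).
    """
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        return 0.0
    # Sorted-rank form: G = sum_i (2i - n - 1) * x_i / (n * sum(x)), i = 1..n
    weighted = sum((2 * i - n - 1) * x for i, x in enumerate(xs, start=1))
    return weighted / (n * total)


def team_power_features(member_scores):
    """Summarize one power proxy (e.g. H-index) for a single team.

    power_level     -> mean of the proxy across team members
    power_hierarchy -> Gini coefficient of the proxy across team members
    """
    return {
        "power_level": mean(member_scores),
        "power_hierarchy": gini(member_scores),
    }


if __name__ == "__main__":
    # Hypothetical H-index values for two three-person teams.
    flat_team = [10, 10, 10]
    steep_team = [1, 2, 27]
    print(team_power_features(flat_team))
    print(team_power_features(steep_team))
```

In this sketch both example teams have the same mean H-index, so they share the same power level, but the second team has a much larger Gini coefficient and therefore a steeper power hierarchy. Repeating the calculation for each proxy (citations, collaborators, career age, paper count) would give one level feature and one hierarchy feature per proxy.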


