An analysis of design process and performance in distributed data science teams

Christopher McComb, J. Defranco, Torsten Maier
Journal: Team Performance Management: An International Journal
DOI: 10.31224/osf.io/fwrqj (https://doi.org/10.31224/osf.io/fwrqj)
Publication date: 2019-06-05
Publication type: Journal Article
Citations: 5

Abstract

Purpose: Often, it is assumed that teams are better at solving problems than individuals working independently. However, recent work in engineering, design and psychology contradicts this assumption. This study aims to examine the behavior of teams engaged in data science competitions. Crowdsourced competitions have seen increased use for software development and data science, and platforms often encourage teamwork between participants.

Design/methodology/approach: We specifically examine the teams participating in data science competitions hosted by Kaggle. We analyze the data provided by Kaggle to compare the effects of team size and interaction frequency on team performance. We also contextualize these results through a semantic analysis.

Findings: This work demonstrates that groups of individuals working independently may outperform interacting teams on average, but that small, interacting teams are more likely to win competitions. The semantic analysis revealed differences in forum participation, verb usage and pronoun usage when comparing top- and bottom-performing teams.

Research limitations/implications: These results reveal a perplexing tension that must be explored further: true teams may experience better performance with higher cohesion, but nominal teams may perform even better on average with essentially no cohesion. Limitations of this research include not factoring in team member experience level and reliance on extant data.

Originality/value: These results are potentially of use to designers of crowdsourced data science competitions, as well as to managers of and contributors to distributed software development projects.