{"title":"Learning machine learning: On the political economy of big tech's online AI courses","authors":"Inga Luchs, C. Apprich, M. Broersma","doi":"10.1177/20539517231153806","DOIUrl":"https://doi.org/10.1177/20539517231153806","url":null,"abstract":"Machine learning (ML) algorithms are still a novel research object in the field of media studies. While existing research focuses on concrete software on the one hand and the socio-economic context of the development and use of these systems on the other, this paper studies online ML courses as a research object that has received little attention so far. By pursuing a walkthrough and critical discourse analysis of Google's Machine Learning Crash Course and IBM's introductory course to Machine Learning with Python, we not only shed light on the technical knowledge, assumptions, and dominant infrastructures of ML as a field of practice, but also on the economic interests of the companies providing the courses. We demonstrate how the online courses further support Google and IBM to consolidate and even expand their position of power by recruiting new AI talent and by securing their infrastructures and models to become the dominant ones. Further, we show how the companies not only influence greatly how ML is represented, but also how these representations in turn influence and direct current ML research and development, as well as the societal effects of their products. Here, they boast an image of fair and democratic artificial intelligence, which stands in stark contrast to the ubiquity of their corporate products and the advertised directives of efficiency and performativity the companies strive for. 
This underlines the need for alternative infrastructures and perspectives.","PeriodicalId":47834,"journal":{"name":"Big Data & Society","volume":" ","pages":""},"PeriodicalIF":8.5,"publicationDate":"2023-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"41796643","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"社会学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Machine learning, meaning making: On reading computer science texts","authors":"Louise Amoore, Alexander Campolo, Benjamin N. Jacobsen, Ludovico Rella","doi":"10.1177/20539517231166887","DOIUrl":"https://doi.org/10.1177/20539517231166887","url":null,"abstract":"Computer science tends to foreclose the reading of its texts by social science and humanities scholars – via code and scale, mathematics, black box opacities, secret or proprietary models. Yet, when computer science papers are read in order to better understand what machine learning means for societies, a form of reading is brought to bear that is not primarily about excavating the hidden meaning of a text or exposing underlying truths about science. Not strictly reading to make sense or to discern definitive meaning of computer science texts, reading is an engagement with the sense-making and meaning-making that takes place. We propose a strategy for reading computer science that is attentive to the act of reading itself, that stays close to the difficulty involved in all forms of reading, and that works with the text as already properly belonging to the ethico-politics that this difficulty engenders. Addressing a series of three “reading problems” – genre, readability, and meaning – we discuss machine learning textbooks and papers as sites where today's algorithmic models are actively giving accounts of their paradigmatic worldview. Much more than matters of technical definition or proof of concept, texts are sites where concepts are forged and contested. 
In our times, when the political application of AI and machine learning is so commonly geared to settle or predict difficult societal problems in advance, a reading strategy must open the gaps and difficulties of that which cannot be settled or resolved.","PeriodicalId":47834,"journal":{"name":"Big Data & Society","volume":" ","pages":""},"PeriodicalIF":8.5,"publicationDate":"2023-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"48636165","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"社会学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Formally comparing topic models and human-generated qualitative coding of physician mothers’ experiences of workplace discrimination","authors":"Adam S. Miner, Sheridan A Stewart, M. Halley, Laura K. Nelson, Eleni Linos","doi":"10.1177/20539517221149106","DOIUrl":"https://doi.org/10.1177/20539517221149106","url":null,"abstract":"Differences between computationally generated and human-generated themes in unstructured text are important to understand yet difficult to assess formally. In this study, we bridge these approaches through two contributions. First, we formally compare a primarily computational approach, topic modeling, to a primarily human-driven approach, qualitative thematic coding, in an impactful context: physician mothers’ experience of workplace discrimination. Second, we compare our chosen topic model to a principled alternative topic model to make explicit study design decisions meriting consideration in future research. By formally contrasting computationally generated (i.e. topic modeling) and human-generated (i.e. thematic coding) knowledge, we shed light on issues of interest to several audiences, notably computational social scientists who wish to understand study design tradeoffs, and qualitative researchers who may wish to leverage computational methods to improve the speed and reproducibility of labor-intensive coding. 
Although useful in other domains, we highlight the value of fast, reproducible methods to better understand experiences of workplace discrimination.","PeriodicalId":47834,"journal":{"name":"Big Data & Society","volume":" ","pages":""},"PeriodicalIF":8.5,"publicationDate":"2023-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"44128155","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"社会学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Based and confused: Tracing the political connotations of a memetic phrase across the web","authors":"S. Hagen, D. de Zeeuw","doi":"10.1177/20539517231163175","DOIUrl":"https://doi.org/10.1177/20539517231163175","url":null,"abstract":"Current research on the weaponisation of far-right discourse online has mostly focused on the dangers of normalising hate speech. However, this often operates on questionable assumptions about how far-right terms retain problematic meanings over time and across different platforms. Yet contextual meaning-change, we argue, is key to assessing the normalisation of problematic but fuzzy terms as they spread across the Web. To redress this, our article traces the changing meaning of the term based, a word that was appropriated from Black Twitter to become a staple of online far-right slang in the mid-2010s. Through a quali-quantitative cross-platform approach, we analyse the evolution of the term between 2010 and 2021 on Twitter, Reddit and 4chan. We find that while the far-right meaning of based partially survived, its meaning changed and was rendered diffuse as it was adopted by other communities, afforded by a repurposable kernel of meaning to based as ‘not caring about what other people think’ and ‘being true to yourself’ to which different (political) connotations were attached. 
This challenges the understanding of far-right memes and hate speech as carrying a single and persistent problematic message, and instead emphasises their varied meanings and subcultural functions within specific online communities.","PeriodicalId":47834,"journal":{"name":"Big Data & Society","volume":" ","pages":""},"PeriodicalIF":8.5,"publicationDate":"2023-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"42830558","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"社会学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"‘I started seeing shadows everywhere’: The diverse chilling effects of surveillance in Zimbabwe","authors":"A. Stevens, P. Fussey, Daragh Murray, Kuda Hove, Otto Saki","doi":"10.1177/20539517231158631","DOIUrl":"https://doi.org/10.1177/20539517231158631","url":null,"abstract":"Recent years have witnessed growing ubiquity and potency of state surveillance measures with heightened implications for human rights and social justice. While impacts of surveillance are routinely framed through ‘privacy’ narratives, emphasising ‘chilling effects’ surfaces a more complex range of harms and rights implications for those who are, or believe they are, subjected to surveillance. Although first emphasised during the McCarthy era, surveillance ‘chilling effects’ remain under-researched, particularly in Africa. Drawing on rare interview data from participants subjected to state-sponsored surveillance in Zimbabwe, the paper reveals complex assemblages of state and non-state actors involved in diverse and expansive hybrid online–offline monitoring. While scholarship has recently emphasised the importance of large-scale digital mass surveillance, the Zimbabwean context reveals complex assemblages of ‘big data’, social media and other digital monitoring combining with more traditional human surveillance practices. Such inseparable online–offline imbrications compound the scale, scope and impact of surveillance and invite analyses as an integrated ensemble. The paper evidences how these surveillance activities exert chilling effects that vary in form, scope and intensity, and implicate rights essential to the development of personal identity and effective functioning of participatory democracy. Moreover, the data reveals impacts beyond the individual to the vicarious and collective. These include gendered dimensions, eroded interpersonal trust and the depleted ability of human rights defenders to organise and participate in democratic processes. 
Overall, surveillance chilling effects exert a wide spectrum of outcomes which consequently interfere with enjoyment of multiple rights and hold both short- and long-term implications for democratic participation.","PeriodicalId":47834,"journal":{"name":"Big Data & Society","volume":" ","pages":""},"PeriodicalIF":8.5,"publicationDate":"2023-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"46180129","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"社会学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Clicks and particulates: Value, alienation, and attunement as unifying themes in big data studies","authors":"G. Ottinger, K. Bronson, D. Nafus","doi":"10.1177/20539517231184891","DOIUrl":"https://doi.org/10.1177/20539517231184891","url":null,"abstract":"Critiques of data colonialism and surveillance capitalism focus on data collected from online behavior. We propose that analytical concepts from these critiques—namely, regimes of value and patterns of alienation and attunement—could be applied more widely to better understand the threats that datafication poses to equity and democracy in the social and environmental realms. Regimes of value, which include the institutions and technologies that make data meaningful and render them selectively available for appropriation, are relevant both to for-profit companies’ data practices and to states’ participation in the datafication of the environment; examining regimes of value raises questions about how data are exploited and how they are neglected. Patterns of alienation associated with datafication include the potential for alienation from the environment; however, at least in some value regimes, alienation may be accompanied by possibilities for attunement to natural and social phenomena that might otherwise have escaped notice.","PeriodicalId":47834,"journal":{"name":"Big Data & Society","volume":" ","pages":""},"PeriodicalIF":8.5,"publicationDate":"2023-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"46584764","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"社会学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"FAIR data sharing: An international perspective on why medical researchers are lagging behind","authors":"L. Rainey, J. Lutomski, M. Broeders","doi":"10.1177/20539517231171052","DOIUrl":"https://doi.org/10.1177/20539517231171052","url":null,"abstract":"FAIR data, that is, Findable, Accessible, Interoperable, and Reusable data, and Big Data intersect across issues related to data storage, access, and processing. The solution-oriented FAIR principles serve an integral role in improving Big Data; yet to date, the implementation of FAIR in multiple sectors has been fragmented. We conducted an exploratory analysis to identify incentives and barriers in creating FAIR data in the medical sector using digital concept mapping, a systematic mixed methods approach. Thirty-eight principal investigators (PIs) were recruited from North America, Europe, and Oceania. Our analysis revealed five clusters rated according to perceived relevance: ‘Efficiency and collaboration’ (rating 7.23), ‘Privacy and security’ (rating 7.18), ‘Data management standards’ (rating 7.16), ‘Organization of services’ (rating 6.98), and ‘Ownership’ (rating 6.28). All five clusters scored relatively high and within a narrow range (i.e., 6.28–7.69), implying that each cluster likely influences researchers’ decision-making processes. PIs harbor a positive view of FAIR data sharing, as exemplified by participants highly prioritizing ‘Efficiency and collaboration’. However, the other four clusters received only modestly lower ratings and largely contained barriers to FAIR data sharing. When viewed collectively, the benefits of efficiency and collaboration may not be sufficient in propelling FAIR data sharing. Arguably, until more of these reported barriers are addressed, widespread support of FAIR data will not translate into widespread practice. 
This research lays the preliminary foundation for conducting targeted large-scale research into FAIR data practices in the medical research community.","PeriodicalId":47834,"journal":{"name":"Big Data & Society","volume":" ","pages":""},"PeriodicalIF":8.5,"publicationDate":"2023-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"45542784","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"社会学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Data arenas: The relational dynamics of data activism","authors":"Bartosz Ślosarski","doi":"10.1177/20539517231177617","DOIUrl":"https://doi.org/10.1177/20539517231177617","url":null,"abstract":"The article proposes the theoretical category of data arenas as a relational field for strategic actors in diverse areas of the contentious politics of data (Beraldo and Milan, 2019). The paper argues that the conceptualization of data activism needs to be related to the immediate data arena in which the action takes place, in order to select the interactive opportunities and threats for emerging data-driven repertoires of action. To fully work through the relational dynamics of data activism, it is necessary to move from a conceptualization of data infrastructure to the notion of data arenas as an ‘open-ended bundle of rules and resources that allows certain kinds of interaction to proceed’ (Jasper, 2006: 141). Using the case of environmental data activism, I highlight four key dimensions to study: (a) strategic use of data as capital that differentiates and positions actors, as well as influences their further choices; (b) practices of defining the boundaries of the problem on which the arena focuses and outlining the pool of actors who participate in the process of solving it; (c) sets of relationships among the outlined pool of actors which represent opportunities and threats for the actors, related to the position they occupy within an arena; and (d) power as the ability to control and shape an arena. 
The data arena approach sheds new light on data activism as a relational practice, combining the latest developments in research on data contexts and the political situatedness of data with the emerging field of research on data activism.","PeriodicalId":47834,"journal":{"name":"Big Data & Society","volume":" ","pages":""},"PeriodicalIF":8.5,"publicationDate":"2023-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"41726502","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"社会学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Digital identity as platform-mediated surveillance","authors":"S. Masiero","doi":"10.1177/20539517221135176","DOIUrl":"https://doi.org/10.1177/20539517221135176","url":null,"abstract":"Digital identity systems are usually viewed as datafiers of existing populations. Yet a platform view finds limited space in the digital identity discourse, with the result that the platform features of digital identity systems are not seen in relation to their surveillance outcomes. In this commentary I illuminate how the core platform properties of digital identity systems afford the undue surveillance of vulnerable groups, leading users into the binary condition of either registering and being profiled, or giving up essential benefits from providers of development programmes. By doing so I contest the “dark side” narrative often applied to digital identity, arguing that, rather than just a side, it is the very inner matter of digital identity platforms that enables surveillance outcomes.","PeriodicalId":47834,"journal":{"name":"Big Data & Society","volume":" ","pages":""},"PeriodicalIF":8.5,"publicationDate":"2023-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"43817463","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"社会学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"The ethical dimensions of Google autocomplete","authors":"Rosie Graham","doi":"10.1177/20539517231156518","DOIUrl":"https://doi.org/10.1177/20539517231156518","url":null,"abstract":"What questions should we ask of Google’s Autocomplete suggestions? This article highlights some of the key ethical issues raised by Google’s automated suggestion tool that provides potential queries below a user’s search box. Much of the discourse surrounding Google’s suggestions has been framed through legal cases in which complex issues can become distilled into black-and-white questions of the law. For example, do Google have to remove a particular suggestion and do they have to pay a settlement for damages? This commentary argues that shaping this discourse along primarily legal lines obscures many of these other moral dimensions raised by Google Autocomplete. Building from existing typologies, this commentary first outlines the legal discourse before exploring five additional ethical challenges, each framed around a particular moral question in which all users have a stake. Written in the form of a commentary, the purpose of this article is not to conclusively answer the ethical questions raised, but rather to give an account of why these particular questions are worth debating. Autocomplete’s suggestions are not simply a mirror of what users are typing into Google’s search bar. Google’s official statement is that “Autocomplete is a time-saving but complex feature. It doesn’t simply display the most common queries on a given topic” but “also predict[s] individual words and phrases that are based on both real searches as well as word patterns found across the web” (Google, 2022). Both its underlying methods and associated terminology have changed throughout time, shifting between providing completions, suggestions, and predictions. 
In doing so, the grounds for potential critique are ever-changing, which means that Google’s approach to Autocomplete deserves significant scrutiny.","PeriodicalId":47834,"journal":{"name":"Big Data & Society","volume":" ","pages":""},"PeriodicalIF":8.5,"publicationDate":"2023-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"47412208","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"社会学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}