Using generative AI for large-scale qualitative analysis of social media posts to understand why people leave computer science
Amanda Ross, Andrew Katz
Journal of Engineering Education, 114(4). Published 2025-09-30.
DOI: 10.1002/jee.70036
https://onlinelibrary.wiley.com/doi/10.1002/jee.70036
Citation count: 0
Abstract
Background
Computer science faces a persistent attrition problem, with people leaving the field at a rate that exceeds new entrants. Given the increasing demand for computing jobs, it is essential to focus on reducing the number of individuals exiting the field.
Purpose
This study investigates why individuals leave the computer science field across various stages and contexts, addressing two questions: (1) What are the reasons for leaving? (2) What external factors influence these decisions?
Method
This large-scale qualitative study collected over 10,000 Reddit posts using keyword-based scraping. Using generative AI, we refined the dataset, filtering it down to 263 relevant posts. We then applied generative AI to thematic analysis of this subset, following the established GATOS method. We extend this approach by integrating a human-in-the-loop process to contextualize the identified themes within social cognitive career theory.
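The two-stage narrowing described above (keyword-based collection, then a model-based relevance filter) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the `Post` type, the `KEYWORDS` tuple, and the `toy_relevance` function are hypothetical, and `toy_relevance` is a simple rule standing in for the generative AI relevance judgment, which the abstract does not specify.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass(frozen=True)
class Post:
    """A scraped post (hypothetical structure, for illustration only)."""
    post_id: str
    text: str


# Hypothetical search terms; the study's actual keywords are not given here.
KEYWORDS = ("leaving computer science", "quit programming", "left tech")


def keyword_match(post: Post, keywords: Iterable[str] = KEYWORDS) -> bool:
    """Stage 1: keep a post if it contains any keyword (case-insensitive)."""
    lowered = post.text.lower()
    return any(keyword in lowered for keyword in keywords)


def filter_posts(
    posts: Iterable[Post],
    is_relevant: Callable[[str], bool],
    keywords: Iterable[str] = KEYWORDS,
) -> List[Post]:
    """Stage 2: pass keyword matches through a relevance judge.

    `is_relevant` is a pluggable callable; in a real workflow it would
    wrap a call to a generative model (an assumption, not shown here).
    """
    keywords = tuple(keywords)
    candidates = [post for post in posts if keyword_match(post, keywords)]
    return [post for post in candidates if is_relevant(post.text)]


def toy_relevance(text: str) -> bool:
    """Stand-in for a model judgment: treat posts that state a reason
    (contain the word 'because') as relevant."""
    return "because" in text.lower()
```

Passing the relevance judge in as a callable keeps the keyword stage testable on its own and lets the model-backed step be swapped or audited by a human reviewer without changing the pipeline.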
Results
Findings reveal diverse reasons for leaving, including job dissatisfaction, interests in other fields, psychological factors, academic challenges, health concerns, and industry issues. Influential factors include background, transition requirements, alternative field characteristics, and personal circumstances. Although the extent varied, all of these reasons and factors were observed at every departure stage.
Conclusions
These findings provide important insights that can help inform industry and academic policies and practices. Additionally, we contribute to the development of more efficient, scalable workflows for future qualitative research using generative AI.