S. Ernala, M. Birnbaum, Kristin A. Candan, Asra F. Rizvi, W. A. Sterling, J. Kane, M. Choudhury
{"title":"Methodological Gaps in Predicting Mental Health States from Social Media: Triangulating Diagnostic Signals","authors":"S. Ernala, M. Birnbaum, Kristin A. Candan, Asra F. Rizvi, W. A. Sterling, J. Kane, M. Choudhury","doi":"10.1145/3290605.3300364","DOIUrl":null,"url":null,"abstract":"A growing body of research is combining social media data with machine learning to predict mental health states of individuals. An implication of this research lies in informing evidence-based diagnosis and treatment. However, obtaining clinically valid diagnostic information from sensitive patient populations is challenging. Consequently, researchers have operationalized characteristic online behaviors as \"proxy diagnostic signals\" for building these models. This paper posits a challenge in using these diagnostic signals, purported to support clinical decision-making. Focusing on three commonly used proxy diagnostic signals derived from social media, we find that predictive models built on these data, although offer strong internal validity, suffer from poor external validity when tested on mental health patients. A deeper dive reveals issues of population and sampling bias, as well as of uncertainty in construct validity inherent in these proxies. We discuss the methodological and clinical implications of these gaps and provide remedial guidelines for future research.","PeriodicalId":20454,"journal":{"name":"Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems","volume":null,"pages":null},"PeriodicalIF":0.0000,"publicationDate":"2019-05-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"93","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3290605.3300364","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 93
Abstract
A growing body of research is combining social media data with machine learning to predict mental health states of individuals. An implication of this research lies in informing evidence-based diagnosis and treatment. However, obtaining clinically valid diagnostic information from sensitive patient populations is challenging. Consequently, researchers have operationalized characteristic online behaviors as "proxy diagnostic signals" for building these models. This paper posits a challenge in using these diagnostic signals, purported to support clinical decision-making. Focusing on three commonly used proxy diagnostic signals derived from social media, we find that predictive models built on these data, although offer strong internal validity, suffer from poor external validity when tested on mental health patients. A deeper dive reveals issues of population and sampling bias, as well as of uncertainty in construct validity inherent in these proxies. We discuss the methodological and clinical implications of these gaps and provide remedial guidelines for future research.