Collecting language, speech acoustics, and facial expression to predict psychosis and other clinical outcomes: strategies from the AMP® SCZ initiative.
Zarina R Bilgrami, Eduardo Castro, Carla Agurto, Einat Liebenthal, Michaela Ennis, Justin T Baker, Isabelle Scott, Beau-Luke Colton, Kang Ik K Cho, Linying Li, Zailyn Tamayo, Mara Henecks, Habiballah Rahimi Eichi, Tae'lar Henry, Jean Addington, Luis K Alameda, Celso Arango, Nicholas J K Breitborde, Matthew R Broome, Kristin S Cadenhead, Monica E Calkins, Eric Yu Hai Chen, Jimmy Choi, Philippe Conus, Barbara A Cornblatt, Lauren M Ellman, Paolo Fusar-Poli, Pablo A Gaspar, Carla Gerber, Louise Birkedal Glenthøj, Leslie E Horton, Christy Hui, Joseph Kambeitz, Lana Kambeitz-Ilankovic, Matcheri S Keshavan, Sung-Wan Kim, Nikolaos Koutsouleris, Jun Soo Kwon, Kerstin Langbein, Daniel Mamah, Covadonga M Diaz-Caneja, Daniel H Mathalon, Vijay A Mittal, Merete Nordentoft, Godfrey D Pearlson, Jesus Perez, Diana O Perkins, Albert R Powers, Jack Rogers, Fred W Sabb, Jason Schiffman, Jai L Shah, Steven M Silverstein, Stefan Smesny, William S Stone, Walid Yassin, Gregory P Strauss, Judy L Thompson, Rachel Upthegrove, Swapna Verma, Jijun Wang, Daniel H Wolf, Patrick D McGorry, Rene S Kahn, John M Kane, Alan Anticevic, Carrie E Bearden, Dominic Dwyer, Tashrif Billah, Sylvain Bouix, Ofer Pasternak, Martha E Shenton, Scott W Woods, Barnaby Nelson, Guillermo A Cecchi, Cheryl M Corcoran, Phillip M Wolff
Schizophrenia (Heidelberg, Germany) 11(1):125. Published 2025-10-15. Journal Article.
DOI: https://doi.org/10.1038/s41537-025-00669-z
PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12528486/pdf/
Abstract
Speech-based detection of early psychosis is progressing at a rapid pace. Within this evolving field, the Accelerating Medicines Partnership® in Schizophrenia (AMP® SCZ) is uniquely positioned to deepen our understanding of how language and related behaviors reflect early psychosis. We begin with detailed standard operating procedures (SOPs) that govern every stage of collection. These SOPs specify how to elicit speech, capture facial expressions, and record acoustics in synchronized audio-video files, both on-site and through remote platforms. We then explain how we chose our sampling tasks, hardware, and software, and how we built streamlined pipelines for data acquisition, aggregation, and processing. Robust quality-assurance and quality-control (QA/QC) routines, along with standardized interviewer training and certification, ensure data integrity across sites. Using natural language processing parsers, large language models, and machine-learning classifiers, we analyzed Data Release 3.0 to uncover systematic grammatical markers of psychosis risk. Speakers at clinical high risk (CHR) produced more referential language but fewer adjectives, adverbs, and nouns than community controls (CC), a pattern that replicated across sampling tasks. Some effects were task-specific: CHR participants showed elevated use of complex syntactic embeddings in two elicitation conditions but not in the third, underscoring the importance of the language sampling task. Together, these results demonstrate how computational linguistics can turn everyday speech into a scalable, objective biomarker, paving the way for earlier and more precise detection of psychosis.
Video Link: https://vimeo.com/1112291965?fl=pl&fe=sh
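To make the part-of-speech comparison described in the abstract concrete, the following is a minimal sketch of how per-speaker rates of adjectives, adverbs, and nouns might be computed from already-tagged tokens. It is illustrative only and is not the AMP® SCZ pipeline: the Universal POS tag names, the function names `pos_rates` and `group_rate`, the toy sentence, and the use of pronouns and determiners as a stand-in for "referential language" are all assumptions made for this example.

```python
from collections import Counter

# Assumption: tokens are already tagged with Universal POS labels.
# The tag set actually used in the study is not stated in the abstract.
CONTENT_TAGS = ("ADJ", "ADV", "NOUN")
# Hypothetical proxy for "referential language"; the paper's definition may differ.
REFERENTIAL_TAGS = ("PRON", "DET")


def pos_rates(tagged_tokens):
    """Return the share of tokens carrying each POS tag."""
    counts = Counter(tag for _, tag in tagged_tokens)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {tag: n / total for tag, n in counts.items()}


def group_rate(rates, tags):
    """Sum the per-tag shares for a group of tags (missing tags count as 0)."""
    return sum(rates.get(tag, 0.0) for tag in tags)


# Toy, hand-tagged sentence used only to exercise the functions.
sample = [
    ("she", "PRON"), ("quickly", "ADV"), ("read", "VERB"),
    ("the", "DET"), ("long", "ADJ"), ("book", "NOUN"),
]
rates = pos_rates(sample)
content_share = group_rate(rates, CONTENT_TAGS)
referential_share = group_rate(rates, REFERENTIAL_TAGS)
```

Rates are normalized by token count so that transcripts of different lengths, or from different elicitation tasks, can be compared on the same scale; a group comparison would then be run on these per-speaker shares.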