Title: LOGLformer: Integrating local and global characteristics for depression scale estimation from facial expressions.
Authors: Cui Cao, Lang He
Journal: Review of Scientific Instruments, vol. 96 (3)
Publication date: 2025-03-01
Publication type: Journal Article
DOI: 10.1063/5.0231737 (https://doi.org/10.1063/5.0231737)
Impact factor: 1.3
JCR: Q3 (Instruments & Instrumentation)
Category: Engineering Technology (region 4)
Open access: no
Citations: 0
Abstract
According to a publication by the World Health Organization, depression is projected to emerge as the leading mental health issue. In the domain of affective computing, deep learning techniques are frequently employed to represent facial dynamics using both local and global perspectives for the purpose of automatic depression detection (ADD). Yet, current models overlook the crucial interplay between local and global dynamics in discerning the significant features essential for ADD. Addressing this oversight, a novel hybrid computational architecture, named LOGLFormer, has been introduced. This architecture integrates CNN-derived local attributes and transformer-sourced global patterns tailored for ADD. Within LOGLFormer, the design philosophies of ResNet and ViT inspire the CNN and transformer branches, respectively. The synergy of these branches encompasses local convolutional mechanisms, self-attention strategies, and multilayer perceptron entities. Furthermore, the intricacies arising from disparities in CNN and transformer feature sets are reconciled through the specially devised feature alignment module. Rigorous comparative analysis underscores the distinctive efficacy of the LOGLFormer in recognizing depression, notably outperforming several state-of-the-art techniques on two dedicated depression databases: AVEC2013 and AVEC2014. Code will be available at https://github.com/helang818/LOGLFormer.
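The abstract does not describe how the feature alignment module works internally, so the following is only a minimal sketch of one common way to reconcile CNN feature maps with transformer tokens, not the authors' implementation. It assumes a feature map of shape (C, H, W), a token sequence of shape (N, D), average pooling to match the token grid, a per-token linear projection to match channel widths, and nearest-neighbour upsampling on the way back. The function names `cnn_to_tokens` and `tokens_to_cnn`, the `patch` parameter, and the projection matrices are all hypothetical.

```python
import numpy as np


def cnn_to_tokens(fmap, w_proj, patch):
    """Map a CNN feature map (C, H, W) to a token sequence (N, D).

    Hypothetical alignment step: pool to the token grid, flatten, project.
    H and W are assumed to be divisible by `patch`.
    """
    c, h, w = fmap.shape
    # Average-pool non-overlapping patch x patch windows so the spatial
    # resolution matches the transformer's token grid.
    pooled = fmap.reshape(c, h // patch, patch, w // patch, patch).mean(axis=(2, 4))
    # Flatten the grid row by row into a sequence: (C, h', w') -> (N, C).
    tokens = pooled.reshape(c, -1).T
    # Per-token linear projection (equivalent to a 1x1 convolution): C -> D.
    return tokens @ w_proj


def tokens_to_cnn(tokens, w_proj, grid, patch):
    """Map a token sequence (N, D) back to a feature map (C, H, W)."""
    gh, gw = grid
    # Project D -> C, then restore the 2-D grid: (N, C) -> (C, gh, gw).
    fmap = (tokens @ w_proj).T.reshape(-1, gh, gw)
    # Nearest-neighbour upsampling back to the CNN branch resolution.
    return fmap.repeat(patch, axis=1).repeat(patch, axis=2)


# Illustrative sizes only; none of these values come from the paper.
rng = np.random.default_rng(0)
C, H, W, D, P = 8, 16, 16, 32, 4
local_features = rng.standard_normal((C, H, W))
global_tokens = cnn_to_tokens(local_features, rng.standard_normal((C, D)), P)
aligned_map = tokens_to_cnn(global_tokens, rng.standard_normal((D, C)), (H // P, W // P), P)
# Simple additive fusion of the local map with the aligned global features.
fused = local_features + aligned_map
```

In a trained model the two projection matrices would be learned parameters; random matrices are used here only to make the shape bookkeeping concrete. Additive fusion is likewise one choice among several (concatenation or gating would also fit the description in the abstract).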
About the journal:
Review of Scientific Instruments (RSI) is committed to the publication of advances in scientific instruments, apparatuses, and techniques. RSI seeks to meet the needs of engineers and scientists in physics, chemistry, and the life sciences.