Discrete variational autoencoders BERT model-based transcranial focused ultrasound for Alzheimer's disease detection
Authors: Kaushika Reddy Thipparthy, Archana Kollu, Chaitanya Kulkarni, Ashit Kumar Dutta, Hardik Doshi, Aditya Kashyap, Kumari Priyanka Sinha, Suresh Babu Kondaveeti, Rupesh Gupta
Journal: Journal of Neuroscience Methods, volume 416, Article 110386
Publication date: 2025-02-03
Publication type: Journal Article
DOI: 10.1016/j.jneumeth.2025.110386
URL: https://www.sciencedirect.com/science/article/pii/S0165027025000275
Impact factor: 2.7; JCR: Q2 (Biochemical Research Methods)
Citation count: 0
Abstract
Research background
Alzheimer's disease (AD) is a neurodegenerative condition whose symptoms include aphasia and diminished verbal fluency. To identify it, researchers have used phonetic attributes, fluency, pauses, and other paralinguistic traits, as well as features derived from transcribed text.
Methods and methodology
Conventional acoustic feature-based detection techniques are limited in their ability to capture semantic information, and transcribing speech into text is both time-consuming and labour-intensive. Non-invasive brain stimulation (NBS), which includes transcranial magnetic stimulation (TMS) and transcranial focused ultrasound (tFUS), has been investigated as a potential intervention to enhance cognitive function and communication in Alzheimer's patients, and has shown efficacy in modulating brain activity and promoting neuroplasticity. This research uses Discrete Variational Autoencoders to transform speech into pseudo-phoneme sequences, then applies the BERT (Bidirectional Encoder Representations from Transformers) model to analyse the relationships among these pseudo-phoneme sequences. The resulting tFUS-BERT model is proposed to capture the linguistic representations of audio.
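The speech-to-token step described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the discretisation can be approximated by a nearest-neighbour lookup into a fixed codebook (a trained discrete VAE would learn the codebook and typically use a differentiable relaxation), and the names `quantize_frames`, `collapse_repeats`, `to_bert_input`, the toy codebook, and the special-token ids are all hypothetical.

```python
import numpy as np


def quantize_frames(frames: np.ndarray, codebook: np.ndarray) -> np.ndarray:
    """Map each frame vector to the index of its nearest codebook entry.

    frames:   (T, D) array of per-frame speech features (assumed given).
    codebook: (K, D) array of code vectors (assumed already learned).
    """
    # Squared Euclidean distance between every frame and every code: (T, K)
    dists = ((frames[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=-1)
    return dists.argmin(axis=1)


def collapse_repeats(tokens: np.ndarray) -> list:
    """Merge runs of identical consecutive tokens (an assumed, optional step)."""
    out = []
    for t in tokens.tolist():
        if not out or out[-1] != t:
            out.append(t)
    return out


def to_bert_input(tokens: list, cls_id: int = 0, sep_id: int = 1, offset: int = 2) -> list:
    """Shift code indices past the special tokens and wrap with [CLS] ... [SEP].

    The ids used here are placeholders, not those of any real tokenizer.
    """
    return [cls_id] + [t + offset for t in tokens] + [sep_id]


# Toy data: three 2-D codes and six frames.
codebook = np.array([[0.0, 0.0], [1.0, 1.0], [5.0, 5.0]])
frames = np.array(
    [[0.1, -0.1], [0.2, 0.0], [0.9, 1.2], [4.8, 5.1], [5.2, 4.9], [0.0, 0.1]]
)

tokens = quantize_frames(frames, codebook)      # one code index per frame
pseudo_phonemes = collapse_repeats(tokens)      # pseudo-phoneme sequence
input_ids = to_bert_input(pseudo_phonemes)      # ids for a BERT-style encoder
```

The resulting integer sequence plays the role that word-piece ids play in text BERT, which is why a BERT-style encoder can be applied to audio without a transcription step.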
Result analysis
The proposed tFUS-BERT model achieved an accuracy of 76.06% when combined with Wav2vec 2.0 and 71.83% when combined with HuBERT, outperforming the baseline by 5.63% on the ADReSSo dataset. The model also captured linguistic representations better than traditional acoustic methods, indicating its potential for accurate and scalable Alzheimer's detection.
Comparison with previous studies
On the ADReSSo (Alzheimer's Dementia Recognition through Spontaneous Speech Only) dataset, the model attains an accuracy of 70.42%, a 5.63% improvement over the baseline system.
About the journal:
The Journal of Neuroscience Methods publishes papers that describe new methods that are specifically for neuroscience research conducted in invertebrates, vertebrates or in man. Major methodological improvements or important refinements of established neuroscience methods are also considered for publication. The journal's scope includes all aspects of contemporary neuroscience research, including anatomical, behavioural, biochemical, cellular, computational, molecular, invasive and non-invasive imaging, optogenetic, and physiological research investigations.