Creating a Spanish Speech Corpus to Develop Digital Dementia Biomarkers Using Machine Learning

2022 IEEE Mexican International Conference on Computer Science (ENC), Pub Date: 2022-08-24, DOI: 10.1109/ENC56672.2022.9882903

L. Cabrera-Leyva, Jesús Favela Vara, Dagoberto Cruz-Sandoval, Diana Leticia Paniagua Santos, Maricruz Huerta Jauregui


Citations: 0

Abstract


Creating a Spanish Speech Corpus to Develop Digital Dementia Biomarkers Using Machine Learning

Dementia is one of the most prevalent diseases affecting older adults in Mexico. Interest has been growing in digital biomarkers of dementia based on the analysis of speech. High-quality speech corpora are important for advancing this line of research, but no publicly available Spanish dataset exists for this purpose. We therefore describe a protocol for capturing Spanish audio from older adults for dementia research, along with the lessons learned and the adjustments to the protocol that emerged from a pilot study.

