“Timbre” Pilot Study Conducted Using Training & Validation Data Provisioned by UCSF R2D2 for Screening of Pulmonary Tuberculosis Using Cough (Acoustic Sounds), Clinical & Demographic Inputs

R. Pathri, Shekhar Jha
Journal of Pulmonology Research & Reports. Published 2023-08-31. DOI: 10.47363/jprr/2023(5)144 (https://doi.org/10.47363/jprr/2023(5)144)

Abstract

TimBre from Docturnal offers multidirectional screening of lung ailments: pulmonary tuberculosis, pneumonia, COVID-19 and COPD. Earlier detailed studies of TimBre used a third-party microphone array in an XY arrangement, which provided high-fidelity cough sounds with an average length of more than 5 seconds, together with real-time demographic data such as height, weight and BMI [1]. In the current study, solicited coughs were recorded in clinics in 7 countries (India, Vietnam, Philippines, Uganda, Tanzania, Madagascar and South Africa) using mobile phones from different manufacturers, with each recording lasting 0.5 seconds. A large set of demographic and clinical variables was provided, of which the TimBre algorithm used a subset. Most importantly, the .WAV files were recorded in a single channel at a sampling rate of 44.1 kHz and a bit depth of 16 bits. The study details two approaches. The first concatenated all the 0.5-second WAV files for each StudyID in the training and scoring sets, based on the timestamps provided; the second used the 0.5-second snippets as-is in both the training and validation sets, without any concatenation. On the TEST set, the first approach yielded a sensitivity of 68.6% and a specificity of 71.7% (Table 1) with an AUC of 0.75, while the second approach yielded a sensitivity of 75.41% and a specificity of 68.30% (Table 2) with an AUC of 0.78, as reported by the UCSF R2D2 team. Both approaches used a combination of clinical, demographic and spectral variables. Some variables were derived (BMI) and others excluded (spectral) based on feature importance scores. The ML model performed better in the second approach (higher sensitivity and AUC, slightly lower specificity), and we anticipate further improvement once an additional 714,922 .WAV files, harvested as longitudinal coughs, are appended to the training set as part of a subsequent pilot study.
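The two data-preparation approaches and the reported metrics can be illustrated with a minimal Python sketch. It is not the TimBre implementation: the record layout `(study_id, timestamp, samples)`, the function names, and the choice to order snippets by ascending timestamp before joining them are all assumptions made for illustration. Audio is represented as plain lists of integer samples rather than read from .WAV files.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

# Hypothetical record layout (an assumption, not from the paper):
# (study_id, timestamp, samples), where samples are 16-bit mono PCM values.
Snippet = Tuple[str, float, List[int]]

SAMPLE_RATE_HZ = 44_100  # single channel, 16-bit, as stated in the abstract


def concatenate_by_study(snippets: Iterable[Snippet]) -> Dict[str, List[int]]:
    """Approach 1 (sketch): join each StudyID's snippets into one signal.

    Assumes snippets are joined in ascending timestamp order.
    """
    grouped: Dict[str, List[Tuple[float, List[int]]]] = defaultdict(list)
    for study_id, timestamp, samples in snippets:
        grouped[study_id].append((timestamp, samples))

    joined: Dict[str, List[int]] = {}
    for study_id, items in grouped.items():
        items.sort(key=lambda item: item[0])  # sort on timestamp only
        joined[study_id] = [s for _, samples in items for s in samples]
    return joined


def keep_as_is(snippets: Iterable[Snippet]) -> List[Tuple[str, List[int]]]:
    """Approach 2 (sketch): every 0.5-second snippet is its own example."""
    return [(study_id, samples) for study_id, _, samples in snippets]


def bmi(weight_kg: float, height_m: float) -> float:
    """Derived variable: body mass index in kg/m^2."""
    return weight_kg / (height_m ** 2)


def sensitivity_specificity(y_true: List[int], y_pred: List[int]) -> Tuple[float, float]:
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP).

    Labels are 1 for TB-positive and 0 for TB-negative.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)
```

Under these assumptions, approach 1 produces one longer signal per participant, while approach 2 produces many short examples that share a StudyID. When splitting data for approach 2, keeping all snippets from one StudyID on the same side of the split would avoid leakage between training and validation; the abstract does not say how this was handled.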