Detection of breath cycles in pediatric lung sounds via an object detection-based transfer learning method
Sa-Yoon Park, Ji Soo Park, Jisoo Lee, Hyesu Lee, Yelin Kim, Dong In Suh, Kwangsoo Kim
Biomedical Signal Processing and Control, Volume 105, Article 107693 (published 2025-02-16)
DOI: 10.1016/j.bspc.2025.107693
URL: https://www.sciencedirect.com/science/article/pii/S1746809425002046
Impact factor: 4.9; JCR: Q1 (Engineering, Biomedical)
Citation count: 0
Abstract
Auscultation is critical for assessing the respiratory system in children; however, the lack of pediatric lung sound databases impedes the development of automated analysis tools. This study introduces an object detection-based transfer learning method to accurately predict breath cycles in pediatric lung sounds. We used a model based on the YOLOv1 architecture, initially pre-trained on an adult lung sound dataset (HF_Lung_v1) and subsequently fine-tuned on a pediatric dataset (SNUCH_Lung). The input feature was the log Mel spectrogram, which effectively captured the relevant frequency and temporal information. The pre-trained model achieved an F1 score of 0.900 ± 0.003 on the HF_Lung_v1 dataset. After fine-tuning, it reached an F1 score of 0.824 ± 0.009 on the SNUCH_Lung dataset, confirming the efficacy of transfer learning. This model surpassed the performance of a baseline model trained solely on the SNUCH_Lung dataset without transfer learning. We also explored the impact of segment length, width, and various audio feature extraction techniques; the optimal results were obtained with 15-second segments, a 2-second width, and the log Mel spectrogram. The model is promising for clinical applications, such as generating large-scale annotated datasets, visualizing and labeling individual breath cycles, and performing correlation analysis with physiological indicators. Future research will focus on expanding the pediatric lung sound database through auto-labeling techniques and integrating the model into stethoscopes for real-time analysis. This study highlights the potential of object detection-based transfer learning in enhancing the accuracy of breath cycle prediction in pediatric lung sounds and advancing pediatric respiratory sound analysis tools.
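The abstract reports F1 scores for detected breath cycles but does not state how a predicted cycle is matched to an annotated one. As a rough illustration only, the sketch below scores predicted time intervals against reference intervals using a one-dimensional intersection-over-union (IoU) criterion with greedy one-to-one matching. The IoU threshold of 0.5, the function names, and the example intervals are all assumptions made for this illustration, not details taken from the paper.

```python
from typing import List, Tuple

Interval = Tuple[float, float]  # (start_seconds, end_seconds); hypothetical representation


def iou_1d(a: Interval, b: Interval) -> float:
    """Intersection over union of two time intervals."""
    intersection = max(0.0, min(a[1], b[1]) - max(a[0], b[0]))
    union = (a[1] - a[0]) + (b[1] - b[0]) - intersection
    return intersection / union if union > 0 else 0.0


def f1_for_intervals(
    predicted: List[Interval],
    reference: List[Interval],
    iou_threshold: float = 0.5,  # assumed threshold, not from the paper
) -> float:
    """F1 score with greedy one-to-one matching of predictions to references."""
    matched_refs = set()
    true_positives = 0
    for pred in predicted:
        best_index, best_iou = -1, 0.0
        for index, ref in enumerate(reference):
            if index in matched_refs:
                continue  # each reference cycle may be matched only once
            overlap = iou_1d(pred, ref)
            if overlap > best_iou:
                best_index, best_iou = index, overlap
        if best_index >= 0 and best_iou >= iou_threshold:
            matched_refs.add(best_index)
            true_positives += 1
    false_positives = len(predicted) - true_positives
    false_negatives = len(reference) - true_positives
    denominator = 2 * true_positives + false_positives + false_negatives
    return 2 * true_positives / denominator if denominator else 0.0


# Made-up intervals inside one 15-second segment.
reference_cycles = [(0.5, 3.0), (4.0, 6.5), (8.0, 10.5)]
predicted_cycles = [(0.6, 3.1), (4.5, 6.0), (11.0, 13.0)]
score = f1_for_intervals(predicted_cycles, reference_cycles)
```

In this made-up example, two predictions overlap a reference cycle well enough to count as matches, one prediction is spurious, and one reference cycle is missed, so precision and recall are both 2/3. A stricter or looser threshold would change which predictions count as correct, which is why the matching rule matters when comparing reported F1 scores.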
About the journal:
Biomedical Signal Processing and Control aims to provide a cross-disciplinary international forum for the interchange of information on research in the measurement and analysis of signals and images in clinical medicine and the biological sciences. Emphasis is placed on contributions dealing with the practical, applications-led research on the use of methods and devices in clinical diagnosis, patient monitoring and management.
Biomedical Signal Processing and Control reflects the main areas in which these methods are being used and developed at the interface of both engineering and clinical science. The scope of the journal is defined to include relevant review papers, technical notes, short communications and letters. Tutorial papers and special issues will also be published.