Improved YOLOv8-seg for laryngeal structure recognition in medical images.
Authors: Haipo Cui, Jinjing Wu, Tianying Li, Zui Zou, Wenhui Guo, Long Liu, Qianwen Zhang, Xiaoping Huang
Journal: American journal of translational research, 17(5), 3293-3306 (published 2025-05-15)
DOI: 10.62347/BIHI3707 (https://doi.org/10.62347/BIHI3707)
PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12170417/pdf/
Citations: 0
Abstract
Objectives: Tracheal intubation is a routine procedure in clinical surgeries and emergency situations, essential for maintaining respiration and ensuring airway patency. Due to the complexity of laryngeal structures and the need for rapid airway management in critically ill patients, real-time, accurate identification of key laryngeal structures is crucial for successful intubation. This study presents a real-time laryngeal structure recognition method based on an improved YOLOv8-seg model.
Methods: Laryngeal images collected retrospectively from intubation procedures were used to develop a method that helps clinicians rapidly and precisely identify critical laryngeal structures, such as the epiglottis, glottis, and vocal cords. The proposed model, SlimMSDA-YOLO, integrates a lightweight neck structure, Slimneck, into the original YOLOv8n-seg model by combining GSConv with standard convolutions. This modification reduces the floating-point operations and computational resources required. In addition, a multi-scale dilation attention (MSDA) module was inserted between the neck and head sections to help the network capture features across several receptive fields and focus on critical regions.
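The two ideas in the Methods paragraph can be illustrated with a little arithmetic. The following is a minimal sketch, not the authors' implementation: it assumes the GSConv layout commonly described for Slim-neck designs (a standard convolution producing half the output channels, a 5x5 depthwise convolution producing the other half, then a weight-free channel shuffle), and it uses hypothetical channel counts (128) and dilation rates (1, 2, 3) chosen only for illustration. Bias and normalization weights are ignored.

```python
def conv_params(c_in, c_out, k, groups=1):
    """Weight count of a k x k convolution (bias and normalization ignored)."""
    return (c_in // groups) * c_out * k * k


def gsconv_params(c_in, c_out, k, dw_k=5):
    """Weight count of a GSConv-style block under the assumed layout.

    Half of the output channels come from a standard convolution; the other
    half come from a depthwise convolution applied to that first half.
    The channel shuffle that follows has no weights.
    """
    half = c_out // 2
    dense = conv_params(c_in, half, k)
    depthwise = conv_params(half, half, dw_k, groups=half)
    return dense + depthwise


def effective_kernel(k, dilation):
    """Spatial span covered by a k-tap kernel at the given dilation rate."""
    return k + (k - 1) * (dilation - 1)


# Hypothetical layer sizes, not taken from the paper.
standard = conv_params(128, 128, 3)
gs = gsconv_params(128, 128, 3)
ratio = gs / standard

# Hypothetical dilation rates: the same 3x3 kernel sees wider context
# as the dilation grows, which is the idea behind multi-scale attention.
spans = [effective_kernel(3, d) for d in (1, 2, 3)]

print(f"standard conv weights: {standard}")
print(f"GSConv-style weights:  {gs} ({ratio:.1%} of standard)")
print(f"effective spans for dilation 1, 2, 3: {spans}")
```

Under these assumptions the GSConv-style block needs roughly half the weights of the standard convolution it stands in for, and a single 3x3 kernel covers spans of 3, 5, and 7 pixels at dilation rates 1, 2, and 3. The actual savings in SlimMSDA-YOLO depend on where the blocks are placed and on the real channel widths.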
Results: SlimMSDA-YOLO achieved a precision of 90.4%, a recall of 84.2%, and an mAP50 of 90.1%. Its computational cost was 11.4 GFLOPs (giga floating-point operations), and it had 3,139,819 parameters. These results indicate that the proposed method improves both model efficiency and recognition performance.
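To put the reported figures in context, the short sketch below derives two quantities that the abstract does not state: the F1 score implied by the reported precision and recall, and the approximate size of the weights if every parameter is stored as a 32-bit float. Both are derived values under stated assumptions, not results reported by the authors.

```python
# Figures reported in the abstract.
precision = 0.904
recall = 0.842
num_params = 3_139_819


def f1_score(p, r):
    """Harmonic mean of precision and recall."""
    return 2 * p * r / (p + r)


def weights_mib(n_params, bytes_per_param=4):
    """Approximate weight storage in MiB (assumes float32 by default)."""
    return n_params * bytes_per_param / (1024 ** 2)


f1 = f1_score(precision, recall)
size_mib = weights_mib(num_params)

print(f"derived F1: {f1:.3f}")
print(f"parameters: {num_params / 1e6:.2f} M")
print(f"float32 weights: {size_mib:.1f} MiB (assumed storage format)")
```

The derived F1 is about 0.87, and the float32 weights come to just under 12 MiB. Half-precision or quantized storage would shrink that figure further.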
Conclusions: SlimMSDA-YOLO is lightweight and efficient, which makes it well suited to real-time laryngeal structure recognition during intubation. In comparative experiments, it outperformed other lightweight segmentation networks.