Thi Bang-Suong Nguyen, Hoang-Bac Nguyen, Thi Xuan-Thao Le, Thi Hong-Chau Bui, Le Song-Toan Nguyen, Thao-Huong Nguyen, Truong Cong-Minh Nguyen
Scientific Reports, vol. 15, no. 1, 19171. Published 2025-05-31. DOI: 10.1038/s41598-025-04626-9 (https://doi.org/10.1038/s41598-025-04626-9). Full text: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12126550/pdf/
Applying machine learning with MobileNetV2 model for rapid screening of vaginal discharge samples in vaginitis diagnosis.
Vaginitis is a prevalent gynecological condition that impacts women's quality of life, with most women likely to experience it at least once. Traditional diagnosis involves manually observing vaginal discharge samples under a microscope. This process relies heavily on the technician's expertise and is vulnerable to subjective biases. The study aimed to improve diagnostic accuracy by applying machine learning, specifically the MobileNetV2 model, to automate the classification of vaginal discharge samples. This model supports doctors in identifying causative agents of vaginitis, including Gardnerella vaginalis, fungi, and other pathogens like bacteria or Trichomonas vaginalis. A dataset of 3,164 images from 1,582 vaginal discharge samples of women aged 18 and over was analyzed. Images were taken under a 40x optical microscope with a resolution of 800 × 800 pixels and classified into three groups: Group B (mixed bacteria or Trichomonas vaginalis), Group C (Gardnerella vaginalis, identified by clue cells), and Group F (fungi, e.g., Candida albicans, which appear as hyphae or yeast cells in samples). The data were split into 80% for training, 10% for validation, and 10% for testing. Performance was evaluated using two statistical metrics: the F1 score (a measure of accuracy balancing precision and recall) and the AUC-PR (area under the precision-recall curve, a measure of model reliability for imbalanced datasets). The MobileNetV2 model performed well across all datasets, achieving an F1 score > 0.75 and an AUC-PR > 0.80. It performed best in identifying Gardnerella vaginalis (Group C), with both metrics exceeding 0.90. In conclusion, this study highlights MobileNetV2's potential as a rapid screening tool for vaginitis, particularly in identifying Gardnerella vaginalis. While challenges remain in classifying co-infections (e.g., Groups B vs. F), the model's stability across datasets underscores its practical utility. Integrating AI into vaginitis diagnosis could enhance efficiency, reduce human error, and improve early detection, ultimately advancing patient care.
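The 80/10/10 split and the two metrics named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code: the function names are hypothetical, splitting by sample rather than by image is an assumption (the abstract does not say which was used), and AUC-PR is approximated here by average precision, which is one common estimator and may differ from the one used in the study.

```python
import random

GROUPS = ("B", "C", "F")  # group labels as named in the abstract


def split_samples(sample_ids, seed=0):
    """Shuffle sample IDs and split them 80/10/10 into train/val/test.

    Assumption: the split is done per sample, so that all images taken
    from one sample end up in the same subset.
    """
    ids = list(sample_ids)
    random.Random(seed).shuffle(ids)
    n_train = int(0.8 * len(ids))
    n_val = int(0.1 * len(ids))
    return ids[:n_train], ids[n_train:n_train + n_val], ids[n_train + n_val:]


def f1_score(y_true, y_pred, positive):
    """One-vs-rest F1 for the group `positive`: 2*TP / (2*TP + FP + FN)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def average_precision(is_positive, scores):
    """Estimate the area under the precision-recall curve.

    Images are ranked by descending score; precision is averaged over the
    ranks at which a true positive occurs. Tied scores are not treated
    specially in this sketch.
    """
    ranked = sorted(zip(scores, is_positive), key=lambda pair: pair[0], reverse=True)
    n_pos = sum(1 for _, pos in ranked if pos)
    if n_pos == 0:
        return 0.0
    hits, total = 0, 0.0
    for rank, (_, pos) in enumerate(ranked, start=1):
        if pos:
            hits += 1
            total += hits / rank
    return total / n_pos
```

With 1,582 sample IDs, `split_samples` returns subsets of 1,265, 158, and 159 samples. For a three-group classifier, `f1_score` would be called once per entry in `GROUPS`, and `average_precision` once per group using that group's predicted probability as the score.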
About the journal:
We publish original research from all areas of the natural sciences, psychology, medicine and engineering. You can learn more about what we publish by browsing our specific scientific subject areas below or explore Scientific Reports by browsing all articles and collections.
Scientific Reports has a 2-year impact factor: 4.380 (2021), and is the 6th most-cited journal in the world, with more than 540,000 citations in 2020 (Clarivate Analytics, 2021).
•Engineering
Engineering covers all aspects of engineering, technology, and applied science. It plays a crucial role in the development of technologies to address some of the world's biggest challenges, helping to save lives and improve the way we live.
•Physical sciences
Physical sciences are those academic disciplines that aim to uncover the underlying laws of nature — often written in the language of mathematics. It is a collective term for areas of study including astronomy, chemistry, materials science and physics.
•Earth and environmental sciences
Earth and environmental sciences cover all aspects of Earth and planetary science and broadly encompass solid Earth processes, surface and atmospheric dynamics, Earth system history, climate and climate change, marine and freshwater systems, and ecology. It also considers the interactions between humans and these systems.
•Biological sciences
Biological sciences encompass all the divisions of natural sciences examining various aspects of vital processes. The concept includes anatomy, physiology, cell biology, biochemistry and biophysics, and covers all organisms, from microorganisms to animals and plants.
•Health sciences
The health sciences study health, disease and healthcare. This field of study aims to develop knowledge, interventions and technology for use in healthcare to improve the treatment of patients.