Accurate fatty liver disease diagnosis with a multi-source feature fusion model on the segmented tongue image dataset

IF 13 · Zone 1, Multidisciplinary Journals · JCR Q1, Multidisciplinary Sciences
Jie Gao, Tao Chen, Yong Xu, Yijie Wu, Kunhong Liu, Weihong Qiu, Weimin Ye
Journal of Advanced Research. Published 2025-10-03. DOI: 10.1016/j.jare.2025.10.003 (https://doi.org/10.1016/j.jare.2025.10.003)
Citations: 0

Abstract

Introduction

More than 100 million people in rural areas of China suffer from fatty liver disease (FLD). However, health clinics in remote regions often lack the professional expertise and expensive ultrasound equipment needed for regular liver disease screening. Delayed treatment frequently leads to liver cirrhosis and cancer, imposing a substantial economic burden on both public health systems and affected families.

Objectives

Traditional Chinese Medicine emphasizes a strong association between tongue characteristics and liver health. Using machine learning to model the relationship between tongue images and FLD could enable rapid, non-invasive, large-scale screening in medically underserved areas. However, existing studies in this domain often rely on small private datasets, so their reported model performance cannot be independently verified. Moreover, most studies have used generic convolutional neural networks for feature extraction, which limits interpretability. This research aims to address both problems.

Methods

In this study, we first introduce a Multi-source Feature Fusion-based Tongue Diagnosis Framework for FLD (MFF-TDF). We also developed and released a standardized tongue image dataset with physiological indicators and FLD annotations, comprising 5,717 samples, which to our knowledge is the largest public dataset in this domain. Finally, we evaluated the proposed method through extensive experiments and improved model interpretability using SHapley Additive exPlanations (SHAP) and counterfactual analysis.
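
The abstract does not describe how MFF-TDF combines its sources, so the following is only a generic illustration of feature-level fusion, not the authors' architecture. It assumes each sample already has a vector of image-derived features and a vector of physiological indicators, standardizes each source column by column, and concatenates them. The function names `standardize` and `fuse` and all toy values are hypothetical.

```python
from statistics import mean, pstdev
from typing import List, Sequence


def standardize(rows: Sequence[Sequence[float]]) -> List[List[float]]:
    """Z-score every column so sources on different scales become comparable."""
    columns = list(zip(*rows))
    means = [mean(col) for col in columns]
    # A constant column has zero spread; divide by 1.0 instead to avoid a ZeroDivisionError.
    stds = [pstdev(col) or 1.0 for col in columns]
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in rows]


def fuse(image_feats: Sequence[Sequence[float]],
         indicator_feats: Sequence[Sequence[float]]) -> List[List[float]]:
    """Concatenate standardized image features and indicators, sample by sample."""
    if len(image_feats) != len(indicator_feats):
        raise ValueError("both sources must describe the same samples")
    img = standardize(image_feats)
    ind = standardize(indicator_feats)
    return [a + b for a, b in zip(img, ind)]


# Toy data (made up for illustration): two image-derived features per sample,
# plus sex (0/1), age in years, and height in cm.
image_feats = [[0.2, 0.7], [0.4, 0.1], [0.9, 0.5]]
indicator_feats = [[1, 54, 170], [0, 37, 158], [1, 61, 165]]

fused = fuse(image_feats, indicator_feats)
print(len(fused), "samples with", len(fused[0]), "fused features each")
```

The fused vectors would then feed a downstream classifier. Standardizing before concatenation keeps a large-valued indicator such as height from dominating small-valued image features.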

Results

When tongue images were fused with basic physiological indicators (such as sex, age, and height), FLD prediction in the study population reached an F1-score of 0.797, a recall of 0.847, and an AUC of 0.924. This performance significantly exceeds that of the state-of-the-art methods published in this domain.
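
For readers less familiar with the three reported metrics, the sketch below shows how recall, F1-score, and AUC are conventionally computed for a binary classifier. The labels, scores, and the 0.5 decision threshold are made-up toy values and are unrelated to the study's data; the helper names `recall_and_f1` and `auc` are hypothetical.

```python
from typing import Sequence, Tuple


def recall_and_f1(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[float, float]:
    """Recall and F1 for the positive class (label 1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    recall = tp / (tp + fn) if tp + fn else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return recall, f1


def auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a positive outranks a negative (ties count half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy labels (1 = FLD, 0 = no FLD) and predicted probabilities, made up for illustration.
y_true = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed threshold of 0.5

recall, f1 = recall_and_f1(y_true, y_pred)
auc_value = auc(y_true, scores)
print(f"recall={recall:.3f} f1={f1:.3f} auc={auc_value:.3f}")
```

Recall and F1 depend on the chosen threshold, whereas AUC summarizes ranking quality across all thresholds, which is why the three figures are usually reported together.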

Conclusion

This study developed an automated, explainable tongue diagnosis method that enables low-cost, rapid screening for FLD in large populations, and contributed what is, to our knowledge, the largest public dataset to support future modeling research in this field.


Source journal

Journal of Advanced Research (Multidisciplinary)
CiteScore: 21.60
Self-citation rate: 0.90%
Articles published: 280
Review time: 12 weeks

About the journal: Journal of Advanced Research (J. Adv. Res.) is an applied/natural sciences, peer-reviewed journal that focuses on interdisciplinary research. The journal aims to contribute to applied research and knowledge worldwide through the publication of original and high-quality research articles in the fields of Medicine, Pharmaceutical Sciences, Dentistry, Physical Therapy, Veterinary Medicine, and Basic and Biological Sciences. The following abstracting and indexing services cover the Journal of Advanced Research: PubMed/Medline, Essential Science Indicators, Web of Science, Scopus, PubMed Central, PubMed, Science Citation Index Expanded, Directory of Open Access Journals (DOAJ), and INSPEC.