HerbMet: Enhancing metabolomics data analysis for accurate identification of Chinese herbal medicines using deep learning.

IF 3.0; Zone 3 (Biology); JCR Q2 (Biochemical Research Methods)
Yuyang Sha, Meiting Jiang, Gang Luo, Weiyu Meng, Xiaobing Zhai, Hongxin Pan, Junrong Li, Yan Yan, Yongkang Qiao, Wenzhi Yang, Kefeng Li
Journal: Phytochemical Analysis. Published: 2024-08-21. Publication type: Journal Article. DOI: 10.1002/pca.3437 (https://doi.org/10.1002/pca.3437)
Citations: 0

Abstract

Introduction: Chinese herbal medicines have been utilized for thousands of years to prevent and treat diseases. Accurate identification is crucial since their medicinal effects vary between species and varieties. Metabolomics is a promising approach to distinguish herbs. However, current metabolomics data analysis and modeling in Chinese herbal medicines are limited by small sample sizes, high dimensionality, and overfitting.

Objectives: This study aims to use metabolomics data to develop HerbMet, a high-performance artificial intelligence system for accurately identifying Chinese herbal medicines, particularly those from different species of the same genus.

Methods: We propose HerbMet, an AI-based system for accurately identifying Chinese herbal medicines. HerbMet employs a 1D-ResNet architecture to extract discriminative features from input samples and uses a multilayer perceptron for classification. Additionally, we design a double dropout regularization module to alleviate overfitting and improve the model's performance.
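To make the data flow in the Methods paragraph concrete, the following is a minimal, untrained sketch in plain Python of a pipeline with the same three ingredients: a 1D residual block as feature extractor, a multilayer perceptron head, and two dropout steps. It is not the authors' implementation. The abstract does not say where the two dropout layers sit, how many channels or blocks are used, or what the layer sizes are, so the placement (one dropout after the residual block, one inside the MLP), the single channel, the kernel size of 3, and all names (`init_params`, `forward`, `p_feat`, `p_mlp`) are assumptions made for illustration.

```python
import random

def conv1d_same(x, kernel):
    """1D cross-correlation with zero padding; output length equals input length (odd kernels)."""
    pad = len(kernel) // 2
    padded = [0.0] * pad + list(x) + [0.0] * pad
    return [sum(k * padded[i + j] for j, k in enumerate(kernel)) for i in range(len(x))]

def relu(x):
    return [max(0.0, v) for v in x]

def residual_block(x, k1, k2):
    """Computes relu(x + conv(relu(conv(x)))): the skip connection adds the input back."""
    h = relu(conv1d_same(x, k1))
    h = conv1d_same(h, k2)
    return relu([a + b for a, b in zip(x, h)])

def dropout(x, p, training, rng):
    """Inverted dropout: during training, zero each value with probability p and rescale the rest."""
    if not training or p == 0.0:
        return list(x)
    keep = 1.0 - p
    return [v / keep if rng.random() < keep else 0.0 for v in x]

def linear(x, weights, bias):
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]

def init_params(n_features, n_hidden, n_classes, seed=0):
    """Random, untrained parameters; all sizes and dropout rates are hypothetical."""
    rng = random.Random(seed)
    def rand_matrix(rows, cols):
        return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]
    return {
        "k1": [rng.uniform(-0.5, 0.5) for _ in range(3)],
        "k2": [rng.uniform(-0.5, 0.5) for _ in range(3)],
        "p_feat": 0.5,  # assumed rate for the first dropout
        "p_mlp": 0.5,   # assumed rate for the second dropout
        "w_hidden": rand_matrix(n_hidden, n_features),
        "b_hidden": [0.0] * n_hidden,
        "w_out": rand_matrix(n_classes, n_hidden),
        "b_out": [0.0] * n_classes,
    }

def forward(x, params, training=False, rng=None):
    """Returns one logit per class for a single sample (a list of feature intensities)."""
    rng = rng or random.Random(0)
    h = residual_block(x, params["k1"], params["k2"])          # feature extraction
    h = dropout(h, params["p_feat"], training, rng)            # first dropout (assumed placement)
    h = relu(linear(h, params["w_hidden"], params["b_hidden"]))  # MLP hidden layer
    h = dropout(h, params["p_mlp"], training, rng)             # second dropout (assumed placement)
    return linear(h, params["w_out"], params["b_out"])         # MLP output layer

# Usage with a made-up 16-feature sample and 7 classes (the number of species in the abstract).
params = init_params(n_features=16, n_hidden=8, n_classes=7)
sample = [float(i % 5) for i in range(16)]
logits = forward(sample, params)
predicted = max(range(len(logits)), key=logits.__getitem__)
```

Dropout is active only when `training=True`, so inference is deterministic. A practical version would use a deep learning framework, multiple channels and blocks, and trained weights.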

Results: Compared to 10 commonly used machine learning and deep learning methods, HerbMet achieves superior accuracy and robustness, with an accuracy of 0.9571 and an F1-score of 0.9542 for distinguishing seven similar Panax ginseng species. After feature selection by 25 different feature ranking techniques in combination with prior knowledge, we obtained an accuracy and F1-score of 100% for discriminating P. ginseng species. Furthermore, HerbMet exhibits acceptable inference speed and computational costs compared to existing approaches on both CPU and GPU.
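For readers unfamiliar with the two metrics reported in the Results, the sketch below shows how accuracy and a multi-class F1-score can be computed from true and predicted labels. The abstract does not state which averaging scheme was used for the F1-score; macro averaging (the unweighted mean of per-class F1) is assumed here. The labels "A", "B", "C" and the predictions are made-up placeholders, not data from the study.

```python
def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1, where F1 = 2*TP / (2*TP + FP + FN)."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)

# Placeholder labels: six samples from three hypothetical classes, one misclassified.
y_true = ["A", "A", "B", "B", "C", "C"]
y_pred = ["A", "A", "B", "C", "C", "C"]

acc = accuracy(y_true, y_pred)  # 5 of 6 correct
f1 = macro_f1(y_true, y_pred)   # per-class F1: A = 1.0, B = 2/3, C = 0.8
```

With small, imbalanced classes the two numbers can diverge, which is one reason both are commonly reported together.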

Conclusions: HerbMet surpasses existing solutions for identifying Chinese herbal medicine species. It is simple to use in real-world scenarios, eliminating the need for feature ranking and selection in classical machine learning-based methods.

Source journal: Phytochemical Analysis (Biology - Analytical Chemistry)
CiteScore: 6.00
Self-citation rate: 6.10%
Articles published: 88
Review time: 1.7 months
Journal description: Phytochemical Analysis is devoted to the publication of original articles concerning the development, improvement, validation and/or extension of application of analytical methodology in the plant sciences. The spectrum of coverage is broad, encompassing methods and techniques relevant to the detection (including bio-screening), extraction, separation, purification, identification and quantification of compounds in plant biochemistry, plant cellular and molecular biology, plant biotechnology, the food sciences, agriculture and horticulture. The Journal publishes papers describing significant novelty in the analysis of whole plants (including algae), plant cells, tissues and organs, plant-derived extracts and plant products (including those which have been partially or completely refined for use in the food, agrochemical, pharmaceutical and related industries). All forms of physical, chemical, biochemical, spectroscopic, radiometric, electrometric, chromatographic, metabolomic and chemometric investigations of plant products (monomeric species as well as polymeric molecules such as nucleic acids, proteins, lipids and carbohydrates) are included within the remit of the Journal. Papers dealing with novel methods relating to areas such as data handling/data mining in plant sciences will also be welcomed.