SitoshnaPred: Learning phytochemical descriptors to elucidate Ayurvedic herbal potency

Impact factor 5.4; Zone 2 (Medicine); JCR Q1 (Chemistry, Medicinal)

Journal of Ethnopharmacology. Pub Date: 2025-09-18. DOI: 10.1016/j.jep.2025.120620

Ashish Panghalia, Vikram Singh

Journal of Ethnopharmacology, volume 355, Article 120620. Full text: https://www.sciencedirect.com/science/article/pii/S0378874125013121

Citations: 0

Abstract

Ethnopharmacological relevance

In Ayurveda, the potency of herbs is traditionally classified into three categories: Sīta (cold), Uṣṇa (hot), and Unuṣṇa (neutral). In this study, a gold-standard dataset of 627 herbs with their potency labels was compiled from classical literature-based textbooks. On the premise that herbal potency must be associated with phytochemical constituents, a machine learning framework was developed using the molecular descriptors of the phytochemicals. This study paves the way for characterizing the molecular basis of traditional knowledge of herbal potency and for applying that knowledge to future artificial intelligence-guided technologies that predict the potency and other features of traditional herbs.

Objectives

This study aims to examine the cold and hot nature of Ayurvedic herbs by analyzing their constituent phytochemicals and to develop an ensemble learning-based binary classifier.

Methods

We compiled a standard dataset of 627 herbs that are commonly used in Ayurvedic formulations and are clearly classified by potency into the Sīta (cold), Uṣṇa (hot), and Unuṣṇa (neutral) categories. Because only 7 herbs belonged to the Unuṣṇa category, this class was not used in the further studies. First, a dataset of 13,534 phytochemicals associated with 454 herbs was assembled; no phytochemical associations could be retrieved for the remaining herbs. Next, the distribution of phytochemicals across the Sīta and Uṣṇa herbs was studied, and 1613 two-dimensional and 213 three-dimensional molecular descriptors were calculated for each phytochemical. After reducing the dimensionality of the dataset while retaining 95 % of its variance, binary and ternary SitoshnaPred classifiers were developed using the LightGBM algorithm. SHAP (SHapley Additive exPlanations) analysis and loadings analysis were conducted to interpret the models’ predictions and to identify the most influential molecular descriptors contributing to classification performance.
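The 95 % variance criterion used for dimensionality reduction can be illustrated with a minimal sketch. It assumes a principal component analysis in which each component has an associated eigenvalue (its variance); the function name `components_for_variance` and the toy eigenvalues are hypothetical and are not taken from the paper, whose actual implementation is not described in the abstract.

```python
from itertools import accumulate


def components_for_variance(eigenvalues, threshold=0.95):
    """Return the smallest k such that the k largest eigenvalues
    account for at least `threshold` of the total variance."""
    if not eigenvalues:
        raise ValueError("eigenvalues must not be empty")
    # Rank components from largest to smallest variance.
    ordered = sorted(eigenvalues, reverse=True)
    total = sum(ordered)
    # Walk the cumulative sum until the retained share reaches the threshold.
    for k, running in enumerate(accumulate(ordered), start=1):
        if running / total >= threshold:
            return k
    return len(ordered)


# Toy eigenvalues (hypothetical): cumulative shares are 0.60, 0.80, 0.90, 0.96, 1.00.
toy_eigenvalues = [6.0, 2.0, 1.0, 0.6, 0.4]
n_components = components_for_variance(toy_eigenvalues, threshold=0.95)
print(n_components)
```

Applied to the full descriptor matrix, the same rule is what would yield a fixed number of retained components for a 95 % threshold; the retained components, rather than the raw descriptors, then serve as classifier inputs.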

Results

A total of 7193 phytochemicals were identified in herbs of the Sīta group and 9116 in herbs of the Uṣṇa group. The LightGBM-based SitoshnaPred models were built on the top 92 principal components, which together account for 95 % of the total variance. The binary classification model achieved an accuracy of 94.01 % and an AUC of 98.41 %, whereas the ternary classification model attained an accuracy of 79.84 % and an AUC of 98.10 %. SHAP analysis, combined with loadings analysis, was used to interpret the models’ predictions and to identify the most influential molecular descriptors contributing to classification performance.
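The two group counts sum to more than the 13,534 phytochemicals in the dataset, which implies that some compounds occur in both groups. A small worked calculation, assuming that all three figures count distinct compounds and that both groups are drawn from the same 13,534-compound dataset, gives a lower bound on that overlap by inclusion-exclusion. The helper name `min_shared` is hypothetical; the abstract itself does not report the overlap.

```python
def min_shared(count_a, count_b, union_upper_bound):
    """Lower bound on |A ∩ B| from |A ∩ B| = |A| + |B| - |A ∪ B|,
    given only that |A ∪ B| cannot exceed `union_upper_bound`."""
    return max(0, count_a + count_b - union_upper_bound)


SITA_COUNT = 7193      # phytochemicals in Sīta herbs (from the abstract)
USNA_COUNT = 9116      # phytochemicals in Uṣṇa herbs (from the abstract)
DATASET_SIZE = 13534   # phytochemicals in the whole dataset (from the abstract)

shared_at_least = min_shared(SITA_COUNT, USNA_COUNT, DATASET_SIZE)
print(shared_at_least)
```

Under these assumptions, at least 2775 phytochemicals are shared between Sīta and Uṣṇa herbs, so the classifier must separate the groups despite substantial overlap in their constituents.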

Conclusion

Molecular descriptors of the constituent phytochemicals carry strong signals of the herbal potency characteristics.

Source journal: Journal of Ethnopharmacology (Medicine: General and Complementary Medicine)

CiteScore: 10.30

Self-citation rate: 5.60%

Articles published: 967

Review time: 77 days

About the journal: The Journal of Ethnopharmacology is dedicated to the exchange of information and understandings about people's use of plants, fungi, animals, microorganisms and minerals and their biological and pharmacological effects based on the principles established through international conventions. Early people confronted with illness and disease discovered a wealth of useful therapeutic agents in the plant and animal kingdoms. The empirical knowledge of these medicinal substances and their toxic potential was passed on by oral tradition and sometimes recorded in herbals and other texts on materia medica. Many valuable drugs of today (e.g., atropine, ephedrine, tubocurarine, digoxin, reserpine) came into use through the study of indigenous remedies. Chemists continue to use plant-derived drugs (e.g., morphine, taxol, physostigmine, quinidine, emetine) as prototypes in their attempts to develop more effective and less toxic medicinals.