Development of Machine Learning-Based Models for Mutagenicity Predictions with Applications to Non-Sugar Sweeteners.

IF 3.1 · Zone 4 (Medicine) · Q3 · CHEMISTRY, MEDICINAL

Molecular Informatics · Pub Date: 2025-06-01 · DOI: 10.1002/minf.202400357

Shilpayan Ghosh, Vinay Kumar, Kunal Roy


Citations: 0

Abstract

Artificial sweeteners, often known as non-sugar sweeteners (NSSs), have been used as food additives since World War II, but concerns remain about their mutagenic potential. Every new chemical registration in the food and pharmaceutical industries requires an evaluation of mutagenic potential, which is essential for food safety. Most studies determine the mutagenicity of NSSs solely through in vivo trials, which are costly and time-consuming. To avoid these experimental burdens, this work explores a new approach methodology: developing machine learning (ML) models for mutagenicity prediction and selecting the best models through stringent cross-validation. Two random 50/50 splits of a dataset of 6881 organic compounds are used for model development. Consensus predictions of mutagenic potential are provided for an external set of 332 NSSs by voting among six selected models (the three best ML models by cross-validation from each of the two data-splitting strategies), with the applicability domain assessed by two different approaches. To check the reliability of these predictions, the model-derived consensus predictions are also compared with those generated by the k-nearest neighbor method in the Virtual models for property Evaluation of chemicals within a Global Architecture (VEGA) platform and by the consensus method in the Toxicity Estimation Software Tool (TEST) platform. Based on this analysis, six compounds are prioritized as mutagenic NSSs. The developed models are available from https://sites.google.com/jadavpuruniversity.in/dtc-lab-software/home/mutagenicity-predictor.
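
The consensus step described in the abstract (voting among six models while taking the applicability domain into account) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not specify the voting rule, the tie-breaking behavior, or how the applicability domain is computed, so simple majority voting over in-domain models is assumed here, and the function name `consensus_vote` and the `min_votes` parameter are hypothetical.

```python
from typing import Optional, Sequence


def consensus_vote(
    predictions: Sequence[int],
    in_domain: Sequence[bool],
    min_votes: int = 1,
) -> Optional[int]:
    """Majority vote over the models whose applicability domain covers a compound.

    predictions: one label per model (1 = mutagenic, 0 = non-mutagenic).
    in_domain:   one flag per model, True if the compound lies inside that
                 model's applicability domain (how this is decided is outside
                 the scope of this sketch).
    min_votes:   assumed threshold for the number of in-domain models needed.

    Returns 1 or 0, or None when too few models are in domain or the vote ties.
    """
    if len(predictions) != len(in_domain):
        raise ValueError("predictions and in_domain must have the same length")

    # Only in-domain models are allowed to vote.
    votes = [label for label, ok in zip(predictions, in_domain) if ok]
    if len(votes) < min_votes:
        return None

    positives = sum(votes)
    negatives = len(votes) - positives
    if positives == negatives:
        return None  # assumed behavior: a tie yields no consensus call
    return 1 if positives > negatives else 0


# Six models (three per data split), all in domain, four of six vote "mutagenic".
call = consensus_vote([1, 1, 1, 0, 0, 1], [True] * 6)
```

Returning `None` for ties and out-of-domain compounds keeps uncertain cases separate from confident calls, which matters when the goal is to prioritize only a few compounds for follow-up.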




Source journal

Molecular Informatics CHEMISTRY, MEDICINAL-MATHEMATICAL & COMPUTATIONAL BIOLOGY

CiteScore

7.30

Self-citation rate

2.80%

Articles published

Review time

3 months

Journal description: Molecular Informatics is a peer-reviewed, international forum for publication of high-quality, interdisciplinary research on all molecular aspects of bio/cheminformatics and computer-assisted molecular design. Molecular Informatics succeeded QSAR & Combinatorial Science in 2010. Molecular Informatics presents methodological innovations that will lead to a deeper understanding of ligand-receptor interactions, macromolecular complexes, molecular networks, design concepts and processes that demonstrate how ideas and design concepts lead to molecules with a desired structure or function, preferably including experimental validation. The journal's scope includes but is not limited to the fields of drug discovery and chemical biology, protein and nucleic acid engineering and design, the design of nanomolecular structures, strategies for modeling of macromolecular assemblies, molecular networks and systems, pharmaco- and chemogenomics, computer-assisted screening strategies, as well as novel technologies for the de novo design of biologically active molecules. As a unique feature, Molecular Informatics publishes so-called "Methods Corner" review-type articles, which feature important technological concepts and advances within the scope of the journal.