Development of Machine Learning-Based Models for Mutagenicity Predictions with Applications to Non-Sugar Sweeteners
Shilpayan Ghosh, Vinay Kumar, Kunal Roy
Molecular Informatics, vol. 44, no. 5-6, e2400357. Published 2025-06-01. Journal Article.
DOI: 10.1002/minf.202400357 (https://doi.org/10.1002/minf.202400357)
Impact factor: 3.1; JCR: Q3 (Chemistry, Medicinal)
Abstract
Artificial sweeteners, often known as non-sugar sweeteners (NSSs), have been used as food additives since World War II, but concern remains about their mutagenic potential. Every new chemical registration in the food and pharmaceutical industries requires an evaluation of mutagenic potential, which is essential for food safety. Most studies determine the mutagenicity of NSSs solely through in vivo trials, which can be costly and time-consuming. To avoid these experimental burdens, this work explores a new approach methodology: developing machine learning (ML) models for mutagenicity prediction and selecting the best models by stringent cross-validation. A dataset of 6881 organic compounds is used for model development under two random 50/50 splits. Consensus predictions for the mutagenic potential of an external set of 332 NSSs are obtained by voting across six selected models (the three best ML models, ranked by cross-validation, from each data-splitting strategy), with the applicability domain assessed by two different approaches. To check the reliability of the predictions, the consensus results are also compared with predictions from the k-nearest neighbor method in the Virtual models for property Evaluation of chemicals within a Global Architecture (VEGA) platform and from the consensus method in the Toxicity Estimation Software Tool (TEST) platform. Based on this analysis, six compounds are prioritized as mutagenic NSSs. The developed models are available from https://sites.google.com/jadavpuruniversity.in/dtc-lab-software/home/mutagenicity-predictor.
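The consensus step described in the abstract can be illustrated with a minimal sketch. The abstract does not state the exact voting rule or how the applicability domain is computed, so everything here is an assumption made for illustration: the function name `consensus_vote`, the simple majority rule, the treatment of ties as inconclusive, and the made-up predictions are all hypothetical and are not the authors' implementation.

```python
from typing import Optional, Sequence


def consensus_vote(
    predictions: Sequence[int],
    in_domain: Sequence[bool],
) -> Optional[int]:
    """Majority vote over the models whose applicability domain covers a compound.

    predictions: one 0/1 label per model (1 = mutagenic, 0 = non-mutagenic).
    in_domain:   one flag per model, True if the compound lies inside that
                 model's applicability domain (how this is decided is an
                 assumption left outside this sketch).
    Returns 1 or 0 for a clear majority, or None when no model is in-domain
    or the in-domain votes are tied.
    """
    if len(predictions) != len(in_domain):
        raise ValueError("predictions and in_domain must have the same length")

    # Keep only votes from models that consider the compound in-domain.
    votes = [p for p, ok in zip(predictions, in_domain) if ok]
    if not votes:
        return None

    positives = sum(votes)
    negatives = len(votes) - positives
    if positives == negatives:
        return None  # tie: treated as inconclusive in this sketch
    return 1 if positives > negatives else 0


# Hypothetical outputs of six models for three made-up compounds.
examples = {
    "compound_A": ([1, 1, 1, 0, 1, 1], [True] * 6),
    "compound_B": ([1, 0, 1, 0, 0, 1], [True] * 6),
    "compound_C": ([1, 1, 0, 0, 0, 0], [True, True, False, False, False, False]),
}
results = {name: consensus_vote(p, d) for name, (p, d) in examples.items()}
print(results)
```

In this sketch, out-of-domain models are excluded before counting, so a compound covered by only two models can still receive a call. Whether the published workflow discards such compounds or weights the votes differently is not stated in the abstract.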
About the journal:
Molecular Informatics is a peer-reviewed, international forum for publication of high-quality, interdisciplinary research on all molecular aspects of bio/cheminformatics and computer-assisted molecular design. Molecular Informatics succeeded QSAR & Combinatorial Science in 2010.
Molecular Informatics presents methodological innovations that will lead to a deeper understanding of ligand-receptor interactions, macromolecular complexes, molecular networks, design concepts and processes that demonstrate how ideas and design concepts lead to molecules with a desired structure or function, preferably including experimental validation.
The journal's scope includes but is not limited to the fields of drug discovery and chemical biology, protein and nucleic acid engineering and design, the design of nanomolecular structures, strategies for modeling of macromolecular assemblies, molecular networks and systems, pharmaco- and chemogenomics, computer-assisted screening strategies, as well as novel technologies for the de novo design of biologically active molecules. As a unique feature, Molecular Informatics publishes so-called "Methods Corner" review-type articles, which feature important technological concepts and advances within the scope of the journal.