Artificial intelligence in the life sciences: Latest articles

LAGOM: A transformer-based chemical language model for drug metabolite prediction

IF 5.4

Artificial intelligence in the life sciences Pub Date: 2025-09-17 DOI: 10.1016/j.ailsci.2025.100142

Sofia Larsson, Miranda Carlsson, Richard Beckmann, Filip Miljković, Rocío Mercado

Citations: 0

MegaEye: Applying multiple machine learning approaches to identify oral compounds with ocular bioactivity

IF 5.4

Artificial intelligence in the life sciences Pub Date: 2025-09-15 DOI: 10.1016/j.ailsci.2025.100143

Fabio Urbina, Scott H. Greenwald, Patricia A. Vignaux, Thomas R. Lane, Joshua S. Harris, Mayssa Attar, Keith Luhrs, Sean Ekins

Volume 8, Article 100143

Abstract: The eye is a complex organ with the critical role of mediating the optical and initial signal processing steps of vision. As such, the eye has multiple physiological and dynamic barriers to protect ocular tissues and compartments. Oral administration of pharmacological agents to treat ocular diseases has often failed to demonstrate efficacy in clinical trials. The ability of a molecule to reach a specific target in the eye (e.g. cells in the anterior versus posterior segment) is largely determined by whether its physicochemical properties permit passage across the various ocular barriers (e.g. cornea, sclera, tear dilution, blood-retinal barrier, lymphatic outflow) that are relevant to the route of administration and the target location. The use of machine learning to predict ocular bioactivity of molecules is underexplored. We now describe the curation of several datasets, generated by a wide array of computational approaches, that are used to identify drugs predicted to reach the eye following oral delivery. These datasets included simple molecular properties (e.g. molecular weight), the blood-brain barrier MPO score, and machine learning models that use transporter and other relevant literature datasets as a proxy for the blood-retinal barrier. FDA-approved drugs with reported ocular activity were used to validate the models’ ability to identify additional molecules not in the models. Finally, we used a large language model to rank over 400,000 natural compounds by potential activity in the eye. In summary, we illustrate machine learning model applications that can be expanded for ocular applications in the future to repurpose molecules.

Citations: 0

From task-specific language models to research agents

IF 5.4

Artificial intelligence in the life sciences Pub Date: 2025-08-26 DOI: 10.1016/j.ailsci.2025.100141

Jürgen Bajorath

Citations: 0

A crossover-enhanced Marine Predators Algorithm for gene selection in microarray-based cancer classification

IF 5.4

Artificial intelligence in the life sciences Pub Date: 2025-08-23 DOI: 10.1016/j.ailsci.2025.100140

Sharif Naser Makhadmeh, Yousef Sanjalawe, Mohammed Azmi Al-Betar, Ahmad Nasayreh, Mohammad Aladaileh

Volume 8, Article 100140

Abstract: The DNA microarray technique involves using a chip embedded with numerous DNA sequences to simultaneously estimate the expression of a multitude of genes. This data, laid out in table format, is vital for employing pattern recognition algorithms that distinguish between samples from healthy individuals and those with cancer. However, identifying useful biomarkers within gene selection data presents significant challenges due to its vast dimensionality and the inclusion of noisy, irrelevant genes. To address these challenges, this paper introduces a sophisticated gene selection method using a robust filter called Minimum redundancy maximum relevancy, combined with a novel hybrid optimization algorithm. This algorithm integrates the Improved Marine Predator Optimizer (MPA) with the Crossover operator to form the MPAC method. The MPAC specifically aims to identify a concise set of biomarker genes that substantially improve cancer classification performance. It employs the k-nearest neighbor algorithm for classification tasks. The innovation in MPAC lies in its ability to significantly enhance the performance of the MPA’s search agents. It seeks the most effective gene subsets for cancer biomarkers and is designed to optimize both the depth (exploitation) and breadth (exploration) of the search. The effectiveness of this hybrid approach is rigorously tested on nine well-known microarray datasets. The performance of this hybrid model is compared against other base and advanced optimization algorithms. The findings from these comparisons highlight that the proposed MPAC approach excels in most of the datasets and remains highly competitive across the others.

Citations: 0

A machine learning framework for the prediction and analysis of bacterial antagonism in biofilms using morphological descriptors

IF 5.4

Artificial intelligence in the life sciences Pub Date: 2025-08-20 DOI: 10.1016/j.ailsci.2025.100137

Raphaël Rubrice, Virgile Gueneau, Romain Briandet, Antoine Cornuejols, Vincent Guigue

Volume 8, Article 100137

Abstract: Biofilms are structured microbial communities that promote cell interactions through close spatial organization, leading to cooperative or competitive behaviors. Predicting microbial interactions in biofilms could aid in developing innovative strategies to prevent the colonization of undesirable bacteria. Here, we present a machine learning approach to predict the antagonistic effects of beneficial bacterial candidates Bacillus and Paenibacillus species against undesirable bacteria (Staphylococcus aureus, Enterococcus cecorum, Escherichia coli and Salmonella enterica), based on the morphological descriptors of single-species biofilms. We trained the models using quantitative features (e.g. biofilm volume, thickness, roughness, or substratum coverage). As a proxy for antagonism, an exclusion score was used as the supervised training target. The latter was calculated based on the ratio of biofilm volume between the undesirable bacteria and the beneficial strain. We subsequently applied diverse explainability methods to analyze the resulting model and found insights highlighting the importance of biofilm formation context when predicting antagonism. Our results demonstrate that machine learning offers an efficient, data-driven tool to predict microbial interactions within biofilms and support the selection of competitive beneficial strains against pathogens. This approach enables scalable screening of microbial interactions, making it applicable to both research and biotechnological applications.

Citations: 0

A dystopian nightmare for science and how to survive it

IF 5.4

Artificial intelligence in the life sciences Pub Date: 2025-08-05 DOI: 10.1016/j.ailsci.2025.100139

Sean Ekins

Citations: 0

Unraveling the co-morbidity between COVID-19 and neurodegenerative diseases through multi-scale graph analysis: A systematic investigation of biological databases and text mining

IF 5.4

Artificial intelligence in the life sciences Pub Date: 2025-07-28 DOI: 10.1016/j.ailsci.2025.100138

Negin Sadat Babaiha, Stefan Geissler, Vincent Nibart, Heval Atas Güvenilir, Vinay Srinivas Bharadhwaj, Alpha Tom Kodamullil, Juergen Klein, Marc Jacobs, Martin Hofmann-Apitius

Volume 8, Article 100138

Abstract: The COVID-19 pandemic has generated a vast volume of research, yet much of it focuses on individual diseases, overlooking complex comorbidity relationships. While extensive literature exists on both neurodegenerative diseases (NDDs), such as Alzheimer’s and Parkinson’s, and COVID-19, their intersection remains underexplored. Co-morbidity modeling is crucial, particularly for hospitalized patients often presenting with multiple conditions. This study investigates the interplay between COVID-19 and NDDs by integrating knowledge graphs (KGs) built from curated biomedical datasets and text mining tools. We performed comprehensive analyses (including path analysis, phenotype coverage, and mapping of cellular and genetic factors) across multiple KGs, such as PrimeKG, DrugBank, OpenTargets, and those generated via natural language processing (NLP) methods. Our findings reveal notable variability in graph density and connectivity, with each KG offering unique insights into molecular and phenotypic links between COVID-19 and NDDs. Key genetic and inflammatory markers, especially immune response genes, consistently appeared across graphs, suggesting a shared pathogenic basis. By unifying structured biological data with unstructured textual evidence, we enhance co-morbidity modeling and improve recall in identifying mechanisms underlying COVID-19–NDD interactions. This integrative framework supports the development of a co-morbidity hypothesis database aimed at facilitating therapeutic target discovery. All data, methods, and instructions for accessing the co-morbidity hypothesis database are publicly available at: https://github.com/SCAI-BIO/covid-NDD-comorbidity-NLP.

Citations: 0

Temporal distribution shift in real-world pharmaceutical data: Implications for uncertainty quantification in QSAR models

Artificial intelligence in the life sciences Pub Date: 2025-07-10 DOI: 10.1016/j.ailsci.2025.100132

Hannah Rosa Friesacher, Emma Svensson, Susanne Winiwarter, Lewis Mervin, Adam Arany, Ola Engkvist

Volume 8, Article 100132

Abstract: The estimation of uncertainties associated with predictions from quantitative structure–activity relationship (QSAR) models can accelerate the drug discovery process by identifying promising experiments and allowing an efficient allocation of resources. Several computational tools exist that estimate the predictive uncertainty in machine learning models. However, deviations from the i.i.d. setting have been shown to impair the performance of these uncertainty quantification methods. We use a real-world pharmaceutical dataset to address the pressing need for a comprehensive, large-scale evaluation of uncertainty quantification approaches in the context of realistic distribution shifts over time. We investigate the performance of several popular uncertainty estimation methods for classification models, including ensemble-based and Bayesian approaches. Furthermore, we use this real-world setting to systematically assess the distribution shifts in label and descriptor space and their impact on the capability of the uncertainty quantification methods. Our study reveals significant shifts over time in both label and descriptor space and a clear connection between the magnitude of the shift and the nature of the assay. Moreover, we show that pronounced distribution shifts impair the performance of popular uncertainty quantification methods used in QSAR models. This work highlights the challenges of identifying uncertainty quantification techniques that remain reliable under distribution shifts introduced by real-world data.

Citations: 0

Artificial intelligence methods and approaches to improve data quality in healthcare data

Artificial intelligence in the life sciences Pub Date: 2025-07-04 DOI: 10.1016/j.ailsci.2025.100135

Jarmakoviča Agate

Volume 8, Article 100135

Abstract: This study explores artificial intelligence (AI) methods and approaches used to improve data quality, with a particular focus on healthcare data. Applying a systematic literature review based on the PRISMA framework, the research examines publications from 2020 to 2025 that analyze AI applications across key data quality dimensions: accuracy, completeness, consistency, timeliness, uniqueness, and validity. The study aims to identify which AI methods are most commonly employed and how they align with these quality attributes. A conceptual map was developed to visualize the relationships between dimensions and AI techniques such as deep learning, federated learning, data-centric AI, and ontology-based data governance. Findings reveal that accuracy and consistency are the most emphasized dimensions in the literature, with methods like supervised learning, NLP, and isolation forest frequently applied. In contrast, dimensions like timeliness and validity receive comparatively limited attention. The study concludes that certain AI methods, particularly data-centric and cross-cutting approaches, are effective in addressing multiple data quality challenges simultaneously. These insights offer practical guidance for selecting AI strategies in healthcare data quality improvement and highlight areas for future research.

Citations: 0

Co-folding, the future of docking – prediction of allosteric and orthosteric ligands

Artificial intelligence in the life sciences Pub Date: 2025-06-29 DOI: 10.1016/j.ailsci.2025.100136

Eva Nittinger, Özge Yoluk, Alessandro Tibo, Gustav Olanders, Christian Tyrchan

Volume 8, Article 100136

Abstract: In drug discovery, understanding protein structures is essential for comprehending their functions and interactions with drugs. Traditional methods like X-ray crystallography and cryo-electron microscopy have been used to solve these structures. Recently, computational biology has seen a breakthrough with deep learning algorithms capable of predicting protein structures based on amino acid sequences. These methods have now evolved into predicting protein-ligand interactions from sequence – co-folding methods. Despite the great advancement in the field during the last year, there are still open challenges. Here, we focus on the prediction of allosteric binding sites, using a dataset of 17 orthosteric/allosteric ligand sets. Three different co-folding methods – NeuralPLexer, RoseTTAFold All-Atom and Boltz-1/Boltz-1x – were used to predict both allosteric and orthosteric ligands. Using PoseBusters, the ligand quality was checked, with >90% of ligands predicted by Boltz-1x passing the default quality criteria. Boltz-1, NeuralPLexer and RoseTTAFold All-Atom still show considerable quality drawbacks. The orthosteric ligands were well placed. However, instead of the allosteric pocket, these deep learning approaches generally favor the orthosteric site, which is the one most represented in the training data.

Citations: 0