Latest ArXiv Papers

Augmenting Human Expertise in Weighted Ensemble Simulations through Deep Learning based Information Bottleneck.
ArXiv Pub Date : 2024-11-15
Dedi Wang, Pratyush Tiwary
{"title":"Augmenting Human Expertise in Weighted Ensemble Simulations through Deep Learning based Information Bottleneck.","authors":"Dedi Wang, Pratyush Tiwary","doi":"","DOIUrl":"","url":null,"abstract":"<p><p>The weighted ensemble (WE) method stands out as a widely used segment-based sampling technique renowned for its rigorous treatment of kinetics. The WE framework typically involves initially mapping the configuration space onto a low-dimensional collective variable (CV) space and then partitioning it into bins. The efficacy of WE simulations heavily depends on the selection of CVs and binning schemes. The recently proposed State Predictive Information Bottleneck (SPIB) method has emerged as a promising tool for automatically constructing CVs from data and guiding enhanced sampling in an iterative manner. In this work, we advance this data-driven pipeline by incorporating prior expert knowledge. Our hybrid approach combines SPIB-learned CVs to enhance sampling in explored regions with expert-based CVs to guide exploration in regions of interest, synergizing the strengths of both methods. Through benchmarking on alanine dipeptide and chignolin systems, we demonstrate that our hybrid approach effectively guides WE simulations to sample states of interest, and reduces run-to-run variances. 
Moreover, our integration of the SPIB model also enhances the analysis and interpretation of WE simulation data by effectively identifying metastable states and pathways, and offering direct visualization of dynamics.</p>","PeriodicalId":93888,"journal":{"name":"ArXiv","volume":" ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-11-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11213147/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141473380","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Deciphering SCN2A: A comprehensive review of rodent models of Scn2a dysfunction.
ArXiv Pub Date : 2024-11-15
Katelin E J Scott, Maria F Hermosillo Arrieta, Aislinn J Williams
{"title":"Deciphering <i>SCN2A</i>: A comprehensive review of rodent models of <i>Scn2a</i> dysfunction.","authors":"Katelin E J Scott, Maria F Hermosillo Arrieta, Aislinn J Williams","doi":"","DOIUrl":"","url":null,"abstract":"","PeriodicalId":93888,"journal":{"name":"ArXiv","volume":" ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-11-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11601800/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142741475","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Uncertainty quantification of receptor ligand binding sites prediction.
ArXiv Pub Date : 2024-11-15
Nanjie Chen, Dongliang Yu, Dmitri Beglov, Mark Kon, Julio Enrique Castrillon-Candas
{"title":"Uncertainty quantification of receptor ligand binding sites prediction.","authors":"Nanjie Chen, Dongliang Yu, Dmitri Beglov, Mark Kon, Julio Enrique Castrillon-Candas","doi":"","DOIUrl":"","url":null,"abstract":"<p><p>Recent advancements in protein docking site prediction have highlighted the limitations of traditional rigid docking algorithms, like PIPER, which often neglect critical stochastic elements such as solvent-induced fluctuations. These oversights can lead to inaccuracies in identifying viable docking sites due to the complexity of high-dimensional, stochastic energy manifolds with low regularity. To address this issue, our research introduces a novel model where the molecular shapes of ligands and receptors are represented using multi-variate Karhunen-Loève (KL) expansions. This method effectively captures the stochastic nature of energy manifolds, allowing for a more accurate representation of molecular interactions. Developed as a plugin for PIPER, our scientific computing software enhances the platform, delivering robust uncertainty measures for the energy manifolds of ranked binding sites. Our results demonstrate that top-ranked binding sites, characterized by lower uncertainty in the stochastic energy manifold, align closely with actual docking sites. Conversely, sites with higher uncertainty correlate with less optimal docking positions. 
This distinction not only validates our approach but also sets a new standard in protein docking predictions, offering substantial implications for future molecular interaction research and drug development.</p>","PeriodicalId":93888,"journal":{"name":"ArXiv","volume":" ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-11-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10854274/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139725324","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Effect of Parametric Variation of Chordae Tendineae Structure on Simulated Atrioventricular Valve Closure.
ArXiv Pub Date : 2024-11-14
Nicolas R Mangine, Devin W Laurence, Patricia M Sabin, Wensi Wu, Christian Herz, Christopher N Zelonis, Justin S Unger, Csaba Pinter, Andras Lasso, Steve A Maas, Jeffrey A Weiss, Matthew A Jolley
{"title":"Effect of Parametric Variation of Chordae Tendineae Structure on Simulated Atrioventricular Valve Closure.","authors":"Nicolas R Mangine, Devin W Laurence, Patricia M Sabin, Wensi Wu, Christian Herz, Christopher N Zelonis, Justin S Unger, Csaba Pinter, Andras Lasso, Steve A Maas, Jeffrey A Weiss, Matthew A Jolley","doi":"","DOIUrl":"","url":null,"abstract":"<p><strong>Purpose: </strong>Many approaches have been used to model chordae tendineae geometries in finite element simulations of atrioventricular heart valves. Unfortunately, current \"functional\" chordae tendineae geometries lack fidelity (e.g., branching) that would be helpful when informing clinical decisions. The objectives of this work are (i) to improve synthetic chordae tendineae geometry fidelity to consider branching and (ii) to define how the chordae tendineae geometry affects finite element simulations of valve closure.</p><p><strong>Methods: </strong>In this work, we develop an open-source method to construct synthetic chordae tendineae geometries in the SlicerHeart Extension of 3D Slicer. The generated geometries are then used in FEBio finite element simulations of atrioventricular valve function to evaluate how variations in chordae tendineae geometry influence valve behavior. Effects are evaluated using functional and mechanical metrics.</p><p><strong>Results: </strong>Our findings demonstrated that altering the chordae tendineae geometry of a stereotypical mitral valve led to changes in clinically relevant valve metrics (regurgitant orifice area, contact area, and billowing volume) and valve mechanics (first principal strains). Specifically, cross sectional area had the most influence over valve closure metrics, followed by chordae tendineae density, length, radius and branches. 
We then used this information to showcase the flexibility of our new workflow by altering the chordae tendineae geometry of two additional geometries (mitral valve with annular dilation and tricuspid valve) to improve finite element predictions.</p><p><strong>Conclusion: </strong>This study presents a flexible, open-source method for generating synthetic chordae tendineae with realistic branching structures. Further, we establish relationships between the chordae tendineae geometry and valve functional/mechanical metrics. This research contribution helps enrich our open-source workflow and brings the finite element simulations closer to use in a patient-specific clinical setting.</p>","PeriodicalId":93888,"journal":{"name":"ArXiv","volume":" ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-11-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11601809/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142741490","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
WelQrate: Defining the Gold Standard in Small Molecule Drug Discovery Benchmarking.
ArXiv Pub Date : 2024-11-14
Yunchao Lance Liu, Ha Dong, Xin Wang, Rocco Moretti, Yu Wang, Zhaoqian Su, Jiawei Gu, Bobby Bodenheimer, Charles David Weaver, Jens Meiler, Tyler Derr
{"title":"WelQrate: Defining the Gold Standard in Small Molecule Drug Discovery Benchmarking.","authors":"Yunchao Lance Liu, Ha Dong, Xin Wang, Rocco Moretti, Yu Wang, Zhaoqian Su, Jiawei Gu, Bobby Bodenheimer, Charles David Weaver, Jens Meiler, Tyler Derr","doi":"","DOIUrl":"","url":null,"abstract":"<p><p>While deep learning has revolutionized computer-aided drug discovery, the AI community has predominantly focused on model innovation and placed less emphasis on establishing best benchmarking practices. We posit that without a sound model evaluation framework, the AI community's efforts cannot reach their full potential, thereby slowing the progress and transfer of innovation into real-world drug discovery. Thus, in this paper, we seek to establish a new gold standard for small molecule drug discovery benchmarking, <i>WelQrate</i>. Specifically, our contributions are threefold: <b><i>WelQrate</i> Dataset Collection</b> - we introduce a meticulously curated collection of 9 datasets spanning 5 therapeutic target classes. Our hierarchical curation pipelines, designed by drug discovery experts, go beyond the primary high-throughput screen by leveraging additional confirmatory and counter screens along with rigorous domain-driven preprocessing, such as Pan-Assay Interference Compounds (PAINS) filtering, to ensure the high-quality data in the datasets; <b><i>WelQrate</i> Evaluation Framework</b> - we propose a standardized model evaluation framework considering high-quality datasets, featurization, 3D conformation generation, evaluation metrics, and data splits, which provides a reliable benchmarking for drug discovery experts conducting real-world virtual screening; <b>Benchmarking</b> - we evaluate model performance through various research questions using the <i>WelQrate</i> dataset collection, exploring the effects of different models, dataset quality, featurization methods, and data splitting strategies on the results. 
In summary, we recommend adopting our proposed <i>WelQrate</i> as the gold standard in small molecule drug discovery benchmarking. The <i>WelQrate</i> dataset collection, along with the curation codes, and experimental scripts are all publicly available at WelQrate.org.</p>","PeriodicalId":93888,"journal":{"name":"ArXiv","volume":" ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-11-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11601797/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142741651","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
An interpretable generative multimodal neuroimaging-genomics framework for decoding alzheimer's disease.
ArXiv Pub Date : 2024-11-14
Giorgio Dolci, Federica Cruciani, Md Abdur Rahaman, Anees Abrol, Jiayu Chen, Zening Fu, Ilaria Boscolo Galazzo, Gloria Menegaz, Vince D Calhoun
{"title":"An interpretable generative multimodal neuroimaging-genomics framework for decoding alzheimer's disease.","authors":"Giorgio Dolci, Federica Cruciani, Md Abdur Rahaman, Anees Abrol, Jiayu Chen, Zening Fu, Ilaria Boscolo Galazzo, Gloria Menegaz, Vince D Calhoun","doi":"","DOIUrl":"","url":null,"abstract":"<p><p>Alzheimer's disease (AD) is the most prevalent form of dementia, affecting millions worldwide with a progressive decline in cognitive abilities. The AD continuum encompasses a prodromal stage known as Mild Cognitive Impairment (MCI), where patients may either progress to AD (MCIc) or remain stable (MCInc). Understanding the underlying mechanisms of AD requires complementary analyses relying on different data sources, leading to the development of multimodal deep learning models. In this study, we leveraged structural and functional Magnetic Resonance Imaging (sMRI/fMRI) to investigate the disease-induced grey matter and functional network connectivity changes. Moreover, considering AD's strong genetic component, we introduced Single Nucleotide Polymorphisms (SNPs) as a third channel. Given such diverse inputs, missing one or more modalities is a typical concern of multimodal methods. We hence propose a novel deep learning-based classification framework where a generative module employing Cycle Generative Adversarial Networks (cGAN) was adopted for imputing missing data within the latent space. Additionally, we adopted an Explainable Artificial Intelligence (XAI) method, Integrated Gradients (IG), to extract input features' relevance, enhancing our understanding of the learned representations. Two critical tasks were addressed: AD detection and MCI conversion prediction. Experimental results showed that our framework was able to reach the state-of-the-art in the classification of CN vs AD with an average test accuracy of 0.926 ± 0.02. 
For the MCInc vs MCIc task, we achieved an average prediction accuracy of 0.711 ± 0.01 using the pre-trained model for CN and AD. The interpretability analysis revealed that the classification performance was led by significant grey matter modulations in cortical and subcortical brain areas well known for their association with AD. Moreover, impairments in sensory-motor and visual resting state network connectivity along the disease continuum, as well as mutations in SNPs defining biological processes linked to endocytosis, amyloid-beta, and cholesterol, were identified as contributors to the achieved performance. Overall, our integrative deep learning approach shows promise for AD detection and MCI prediction, while shedding light on important biological insights.</p>","PeriodicalId":93888,"journal":{"name":"ArXiv","volume":" ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-11-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11213156/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141473378","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
MICCAI-CDMRI 2023 QuantConn Challenge Findings on Achieving Robust Quantitative Connectivity through Harmonized Preprocessing of Diffusion MRI.
ArXiv Pub Date : 2024-11-14
Nancy R Newlin, Kurt Schilling, Serge Koudoro, Bramsh Qamar Chandio, Praitayini Kanakaraj, Daniel Moyer, Claire E Kelly, Sila Genc, Jian Chen, Joseph Yuan-Mou Yang, Ye Wu, Yifei He, Jiawei Zhang, Qingrun Zeng, Fan Zhang, Nagesh Adluru, Vishwesh Nath, Sudhir Pathak, Walter Schneider, Anurag Gade, Yogesh Rathi, Tom Hendriks, Anna Vilanova, Maxime Chamberland, Tomasz Pieciak, Dominika Ciupek, Antonio Tristán Vega, Santiago Aja-Fernández, Maciej Malawski, Gani Ouedraogo, Julia Machnio, Christian Ewert, Paul M Thompson, Neda Jahanshad, Eleftherios Garyfallidis, Bennett A Landman
{"title":"MICCAI-CDMRI 2023 QuantConn Challenge Findings on Achieving Robust Quantitative Connectivity through Harmonized Preprocessing of Diffusion MRI.","authors":"Nancy R Newlin, Kurt Schilling, Serge Koudoro, Bramsh Qamar Chandio, Praitayini Kanakaraj, Daniel Moyer, Claire E Kelly, Sila Genc, Jian Chen, Joseph Yuan-Mou Yang, Ye Wu, Yifei He, Jiawei Zhang, Qingrun Zeng, Fan Zhang, Nagesh Adluru, Vishwesh Nath, Sudhir Pathak, Walter Schneider, Anurag Gade, Yogesh Rathi, Tom Hendriks, Anna Vilanova, Maxime Chamberland, Tomasz Pieciak, Dominika Ciupek, Antonio Tristán Vega, Santiago Aja-Fernández, Maciej Malawski, Gani Ouedraogo, Julia Machnio, Christian Ewert, Paul M Thompson, Neda Jahanshad, Eleftherios Garyfallidis, Bennett A Landman","doi":"","DOIUrl":"","url":null,"abstract":"<p><p>White matter alterations are increasingly implicated in neurological diseases and their progression. International-scale studies use diffusion-weighted magnetic resonance imaging (DW-MRI) to qualitatively identify changes in white matter microstructure and connectivity. Yet, quantitative analysis of DW-MRI data is hindered by inconsistencies stemming from varying acquisition protocols. Specifically, there is a pressing need to harmonize the preprocessing of DW-MRI datasets to ensure the derivation of robust quantitative diffusion metrics across acquisitions. In the MICCAI-CDMRI 2023 QuantConn challenge, participants were provided raw data from the same individuals collected on the same scanner but with two different acquisitions and tasked with preprocessing the DW-MRI to minimize acquisition differences while retaining biological variation. Harmonized submissions are evaluated on the reproducibility and comparability of cross-acquisition bundle-wise microstructure measures, bundle shape features, and connectomics. 
The key innovations of the QuantConn challenge are that (1) we assess bundles and tractography in the context of harmonization for the first time, (2) we assess connectomics in the context of harmonization for the first time, and (3) we have 10x more subjects than the prior harmonization challenge MUSHAC and 100x more than SuperMUDI. We find that bundle surface area, fractional anisotropy, connectome assortativity, betweenness centrality, edge count, modularity, nodal strength, and participation coefficient measures are most biased by acquisition and that machine learning voxel-wise correction, RISH mapping, and NeSH methods effectively reduce these biases. In addition, microstructure measures AD, MD, RD, bundle length, connectome density, efficiency, and path length are least biased by these acquisition differences. A machine learning approach that learned voxel-wise cross-acquisition relationships was the most effective at harmonizing connectomic, microstructure, and macrostructure features, but requires the same subject be scanned at each site co-registered. NeSH, a spatial and angular resampling method, was also effective and has a generalizable framework that does not rely on co-registration. Our code is available at https://github.com/nancynewlin-masi/QuantConn/.</p>","PeriodicalId":93888,"journal":{"name":"ArXiv","volume":" ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-11-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11601790/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142741606","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
A new computational model for quantifying blood flow dynamics across myogenically-active cerebral arterial networks.
ArXiv Pub Date : 2024-11-13
Alberto Coccarelli, Ioannis Polydoros, Alex Drysdale, Osama F Harraz, Chennakesava Kadapa
{"title":"A new computational model for quantifying blood flow dynamics across myogenically-active cerebral arterial networks.","authors":"Alberto Coccarelli, Ioannis Polydoros, Alex Drysdale, Osama F Harraz, Chennakesava Kadapa","doi":"","DOIUrl":"","url":null,"abstract":"<p><p>Cerebral autoregulation plays a key physiological role by limiting blood flow changes in the face of pressure fluctuations. Although the involved cellular processes are mechanically driven, the quantification of haemodynamic forces in in-vivo settings remains extremely difficult and uncertain. In this work, we propose a novel computational framework for evaluating the blood flow dynamics across networks of myogenically active cerebral arteries, which can modulate their muscular tone to stabilize flow (and perfusion pressure) as well as to limit vascular intramural stress. The introduced framework is built on contractile (myogenically active) vascular wall mechanics and blood flow dynamics models, which can be numerically coupled in either a weak or strong way. We investigate the time dependency of the vascular wall response to pressure changes at both single vessel and network levels. The robustness of the model was assessed by considering different types of inlet signals and numerical settings in an idealized vascular network formed by a middle cerebral artery and its three generations. For the vessel size and boundary conditions considered, weak coupling ensured accurate results with a lower computational cost. To complete the analysis, we evaluated the effect of an upstream pressure surge on the haemodynamics of the vascular network. This provided a clear quantitative picture of how pressure and flow are redistributed across each vessel generation upon inlet pressure changes. 
This work paves the way for future combined experimental-computational studies aiming to decipher cerebral autoregulation.</p>","PeriodicalId":93888,"journal":{"name":"ArXiv","volume":" ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-11-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11601795/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142741445","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Mixed Effects Deep Learning for the interpretable analysis of single cell RNA sequencing data by quantifying and visualizing batch effects.
ArXiv Pub Date : 2024-11-13
Aixa X Andrade, Son Nguyen, Albert Montillo
{"title":"Mixed Effects Deep Learning for the interpretable analysis of single cell RNA sequencing data by quantifying and visualizing batch effects.","authors":"Aixa X Andrade, Son Nguyen, Albert Montillo","doi":"","DOIUrl":"","url":null,"abstract":"<p><p>Single-cell RNA sequencing (scRNA-seq) data are often confounded by technical or biological batch effects. Existing deep learning models mitigate these effects but often discard batch-specific information, potentially losing valuable biological insights. We propose a Mixed Effects Deep Learning (MEDL) autoencoder framework that separately models batch-invariant (fixed effects) and batch-specific (random effects) components. By decoupling batch-invariant biological states from batch variations, our framework integrates both into predictive models. Our approach also generates 2D visualizations of how the same cell appears across batches, enhancing interpretability. Retaining both fixed and random effect latent spaces improves classification accuracy. We applied our framework to three datasets spanning the cardiovascular system (Healthy Heart), Autism Spectrum Disorder (ASD), and Acute Myeloid Leukemia (AML). With 147 batches in the Healthy Heart dataset, far exceeding typical numbers, we tested our framework's ability to handle many batches. In the ASD dataset, our approach captured donor heterogeneity between autistic and healthy individuals. In the AML dataset, it distinguished donor heterogeneity despite missing cell types and diseased donors exhibiting both healthy and malignant cells. 
These results highlight our framework's ability to characterize fixed and random effects, enhance batch effect visualization, and improve prediction accuracy across diverse datasets.</p>","PeriodicalId":93888,"journal":{"name":"ArXiv","volume":" ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-11-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11601787/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142741609","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
High fitness paths can connect proteins with low sequence overlap.
ArXiv Pub Date : 2024-11-13
Pranav Kantroo, Günter P Wagner, Benjamin B Machta
{"title":"High fitness paths can connect proteins with low sequence overlap.","authors":"Pranav Kantroo, Günter P Wagner, Benjamin B Machta","doi":"","DOIUrl":"","url":null,"abstract":"<p><p>The structure and function of a protein are determined by its amino acid sequence. While random mutations change a protein's sequence, evolutionary forces shape its structural fold and biological activity. Studies have shown that neutral networks can connect a local region of sequence space by single residue mutations that preserve viability. However, the larger-scale connectedness of protein morphospace remains poorly understood. Recent advances in artificial intelligence have enabled us to computationally predict a protein's structure and quantify its functional plausibility. Here we build on these tools to develop an algorithm that generates viable paths between distantly related extant protein pairs. The intermediate sequences in these paths differ by single residue changes over subsequent steps - substitutions, insertions and deletions are admissible moves. Their fitness is evaluated using the protein language model ESM2, and maintained as high as possible subject to the constraints of the traversal. We document the qualitative variation across paths generated between progressively divergent protein pairs, some of which do not even acquire the same structural fold. 
The ease of interpolating between two sequences could be used as a proxy for the likelihood of homology between them.</p>","PeriodicalId":93888,"journal":{"name":"ArXiv","volume":" ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-11-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11601789/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142741595","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0