Patterns | Pub Date: 2025-01-10 | DOI: 10.1016/j.patter.2024.101150
Xilin Shen, Yayan Hou, Xueer Wang, Chunyong Zhang, Jilei Liu, Hongru Shen, Wei Wang, Yichen Yang, Meng Yang, Yang Li, Jin Zhang, Yan Sun, Kexin Chen, Lei Shi, Xiangchun Li
{"title":"A deep learning model for characterizing protein-RNA interactions from sequences at single-base resolution.","authors":"Xilin Shen, Yayan Hou, Xueer Wang, Chunyong Zhang, Jilei Liu, Hongru Shen, Wei Wang, Yichen Yang, Meng Yang, Yang Li, Jin Zhang, Yan Sun, Kexin Chen, Lei Shi, Xiangchun Li","doi":"10.1016/j.patter.2024.101150","DOIUrl":"10.1016/j.patter.2024.101150","url":null,"abstract":"<p><p>Protein-RNA interactions play pivotal roles in regulating transcription, translation, and RNA metabolism. Characterizing these interactions offers key insights into RNA dysregulation mechanisms. Here, we introduce Reformer, a deep learning model that predicts protein-RNA binding affinity from sequence data. Trained on 225 enhanced cross-linking and immunoprecipitation sequencing (eCLIP-seq) datasets encompassing 155 RNA-binding proteins across three cell lines, Reformer achieves high accuracy in predicting binding affinity at single-base resolution. The model uncovers binding motifs that are often undetectable through traditional eCLIP-seq methods. Notably, the motifs learned by Reformer are shown to correlate with RNA processing functions. Validation via electrophoretic mobility shift assays confirms the model's precision in quantifying the impact of mutations on RNA regulation. 
In summary, Reformer improves the resolution of RNA-protein interaction predictions and aids in prioritizing mutations that influence RNA regulation.</p>","PeriodicalId":36242,"journal":{"name":"Patterns","volume":"6 1","pages":"101150"},"PeriodicalIF":6.7,"publicationDate":"2025-01-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11783876/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143081132","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Patterns | Pub Date: 2025-01-10 | DOI: 10.1016/j.patter.2024.101147
Yang Zhang, Andreas Vitalis
{"title":"Benchmarking the robustness of the correct identification of flexible 3D objects using common machine learning models.","authors":"Yang Zhang, Andreas Vitalis","doi":"10.1016/j.patter.2024.101147","DOIUrl":"10.1016/j.patter.2024.101147","url":null,"abstract":"<p><p>True three-dimensional (3D) data are prevalent in domains such as molecular science or computer vision. In these data, machine learning models are often asked to identify objects subject to intrinsic flexibility. Our study introduces two datasets from molecular science to assess the classification robustness of common model/feature combinations. Molecules are flexible, and shapes alone offer intra-class heterogeneities that yield a high risk for confusions. By blocking training and test sets to reduce overlap, we establish a baseline requiring the trained models to abstract from shape. As training data coverage grows, all tested architectures perform better on unseen data with reduced overfitting. Empirically, 2D embeddings of voxelized data produced the best-performing models. Evidently, both featurization and task-appropriate model design are of continued importance, the latter point reinforced by comparisons to recent, more specialized models. Finally, we show that the shape abstraction learned from database samples extends to samples that are evolving explicitly in time.</p>","PeriodicalId":36242,"journal":{"name":"Patterns","volume":"6 1","pages":"101147"},"PeriodicalIF":6.7,"publicationDate":"2025-01-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11783895/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143081150","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Patterns | Pub Date: 2025-01-10 | DOI: 10.1016/j.patter.2024.101151
Eric Nost, Gretchen Gehrke, Lourdes Vera, Steve Hansen
{"title":"Why the Environmental Data & Governance Initiative is archiving public environmental data.","authors":"Eric Nost, Gretchen Gehrke, Lourdes Vera, Steve Hansen","doi":"10.1016/j.patter.2024.101151","DOIUrl":"10.1016/j.patter.2024.101151","url":null,"abstract":"<p><p>Public data help researchers and civic organizations develop solutions and advance accountability around environmental challenges but are vulnerable to political threats. While the Environmental Data & Governance Initiative archives data to ensure their availability, we also situate data within their political and economic contexts to support good scientific practice and democratic knowledge production for environmental justice.</p>","PeriodicalId":36242,"journal":{"name":"Patterns","volume":"6 1","pages":"101151"},"PeriodicalIF":6.7,"publicationDate":"2025-01-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11783871/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143081296","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Patterns | Pub Date: 2025-01-10 | DOI: 10.1016/j.patter.2024.101148
Osho Rawal, Berk Turhan, Irene Font Peradejordi, Shreya Chandrasekar, Selim Kalayci, Sacha Gnjatic, Jeffrey Johnson, Mehdi Bouhaddou, Zeynep H Gümüş
{"title":"PhosNetVis: A web-based tool for fast kinase-substrate enrichment analysis and interactive 2D/3D network visualizations of phosphoproteomics data.","authors":"Osho Rawal, Berk Turhan, Irene Font Peradejordi, Shreya Chandrasekar, Selim Kalayci, Sacha Gnjatic, Jeffrey Johnson, Mehdi Bouhaddou, Zeynep H Gümüş","doi":"10.1016/j.patter.2024.101148","DOIUrl":"10.1016/j.patter.2024.101148","url":null,"abstract":"<p><p>Protein phosphorylation involves the reversible modification of a protein (substrate) residue by another protein (kinase). Liquid chromatography-mass spectrometry studies are rapidly generating massive protein phosphorylation datasets across multiple conditions. Researchers then must infer kinases responsible for changes in phosphosites of each substrate. However, tools that infer kinase-substrate interactions (KSIs) are not optimized to interactively explore the resulting large and complex networks, significant phosphosites, and states. There is thus an unmet need for a tool that facilitates user-friendly analysis, interactive exploration, visualization, and communication of phosphoproteomics datasets. We present PhosNetVis, a web-based tool for researchers of all computational skill levels to easily infer, generate, and interactively explore KSI networks in 2D or 3D by streamlining phosphoproteomics data analysis steps within a single tool. PhosNetVis lowers barriers for researchers by rapidly generating high-quality visualizations to gain biological insights from their phosphoproteomics datasets. 
It is available at https://gumuslab.github.io/PhosNetVis/.</p>","PeriodicalId":36242,"journal":{"name":"Patterns","volume":"6 1","pages":"101148"},"PeriodicalIF":6.7,"publicationDate":"2025-01-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11783894/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143081293","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Patterns | Pub Date: 2025-01-10 | DOI: 10.1016/j.patter.2024.101152
Andrew L Hufton
{"title":"How we measure our impact.","authors":"Andrew L Hufton","doi":"10.1016/j.patter.2024.101152","DOIUrl":"https://doi.org/10.1016/j.patter.2024.101152","url":null,"abstract":"","PeriodicalId":36242,"journal":{"name":"Patterns","volume":"6 1","pages":"101152"},"PeriodicalIF":6.7,"publicationDate":"2025-01-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11783883/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143081155","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Patterns | Pub Date: 2024-12-16 | eCollection Date: 2025-01-10 | DOI: 10.1016/j.patter.2024.101117
Wentao Jiang, Heng Yuan, Wanjun Liu
{"title":"Neuron signal attenuation activation mechanism for deep learning.","authors":"Wentao Jiang, Heng Yuan, Wanjun Liu","doi":"10.1016/j.patter.2024.101117","DOIUrl":"10.1016/j.patter.2024.101117","url":null,"abstract":"<p><p>Neuron signal activation is at the core of deep learning and broadly impacts science and engineering. Despite growing interest in neuron cell stimulation via amplitude current, the activation mechanism of biological neurons has limited application in deep learning due to the lack of a universal mathematical principle suitable for artificial neural networks. Here, we show how deep learning can go beyond the current learning effects through a newly proposed neuron signal activation mechanism. To achieve this, we report a new cross-disciplinary method for neuron signal attenuation, using the inference of differential equations within generalized linear systems to enhance the efficiency of deep learning. We formulate the mathematical model of the efficient activation function, which we refer to as Attenuation (Ant). Ant can represent higher-order derivatives and stabilize data distributions in deep-learning tasks. We demonstrate the effectiveness, stability, and generalization of Ant on many challenging tasks across various neural network architectures.</p>","PeriodicalId":36242,"journal":{"name":"Patterns","volume":"6 1","pages":"101117"},"PeriodicalIF":6.7,"publicationDate":"2024-12-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11783890/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143081206","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Patterns | Pub Date: 2024-12-13 | DOI: 10.1016/j.patter.2024.101098
Chaochao Chen, Xiaohua Feng, Yuyuan Li, Lingjuan Lyu, Jun Zhou, Xiaolin Zheng, Jianwei Yin
{"title":"Integration of large language models and federated learning.","authors":"Chaochao Chen, Xiaohua Feng, Yuyuan Li, Lingjuan Lyu, Jun Zhou, Xiaolin Zheng, Jianwei Yin","doi":"10.1016/j.patter.2024.101098","DOIUrl":"https://doi.org/10.1016/j.patter.2024.101098","url":null,"abstract":"<p><p>As the parameter size of large language models (LLMs) continues to expand, there is an urgent need to address the scarcity of high-quality data. In response, existing research has attempted to make a breakthrough by incorporating federated learning (FL) into LLMs. Conversely, considering the outstanding performance of LLMs in task generalization, researchers have also tried applying LLMs within FL to tackle challenges in relevant domains. The complementarity between LLMs and FL has already ignited widespread research interest. In this review, we aim to deeply explore the integration of LLMs and FL. We propose a research framework dividing the fusion of LLMs and FL into three parts: the combination of LLM sub-technologies with FL, the integration of FL sub-technologies with LLMs, and the overall merger of LLMs and FL. We first provide a comprehensive review of the current state of research in the domain of LLMs combined with FL, including their typical applications, integration advantages, challenges faced, and future directions for resolution. 
Subsequently, we discuss the practical applications of the combination of LLMs and FL in critical scenarios such as healthcare, finance, and education and provide new perspectives and insights into future research directions for LLMs and FL.</p>","PeriodicalId":36242,"journal":{"name":"Patterns","volume":"5 12","pages":"101098"},"PeriodicalIF":6.7,"publicationDate":"2024-12-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11701858/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142956218","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
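The review above treats the combination of LLMs and FL at a conceptual level. As a concrete anchor for the FL side, here is a minimal sketch of federated averaging (FedAvg), the canonical aggregation step; the function name and toy weights are illustrative and not taken from the review.

```python
import numpy as np

def fedavg(client_weights, client_sizes):
    """One FedAvg aggregation round: average each parameter tensor
    across clients, weighted by each client's local data size."""
    total = float(sum(client_sizes))
    coeffs = [n / total for n in client_sizes]
    n_params = len(client_weights[0])
    return [
        sum(c * w[i] for c, w in zip(coeffs, client_weights))
        for i in range(n_params)
    ]

# Two toy clients, each holding one 2x2 weight matrix.
w_a = [np.array([[1.0, 2.0], [3.0, 4.0]])]
w_b = [np.array([[3.0, 4.0], [5.0, 6.0]])]
avg = fedavg([w_a, w_b], client_sizes=[1, 3])  # pulled toward client b
```

In practice, FL systems for LLMs often aggregate only lightweight adapter parameters this way, rather than full model weights, to keep communication tractable.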
Patterns | Pub Date: 2024-12-13 | DOI: 10.1016/j.patter.2024.101114
Yingji Xia, Xiqun Michael Chen, Sudan Sun
{"title":"Data-knowledge co-driven innovations in engineering and management.","authors":"Yingji Xia, Xiqun Michael Chen, Sudan Sun","doi":"10.1016/j.patter.2024.101114","DOIUrl":"https://doi.org/10.1016/j.patter.2024.101114","url":null,"abstract":"<p><p>Modern intelligent engineering and management scenarios require advanced data utilization methodologies. Here, we propose and discuss data-knowledge co-driven innovations that could address emerging challenges, and we advocate for the adoption of interdisciplinary methodologies in numerous engineering and management applications.</p>","PeriodicalId":36242,"journal":{"name":"Patterns","volume":"5 12","pages":"101114"},"PeriodicalIF":6.7,"publicationDate":"2024-12-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11701839/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142956213","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Patterns | Pub Date: 2024-12-09 | eCollection Date: 2024-12-13 | DOI: 10.1016/j.patter.2024.101116
Christopher Wiedeman, Ge Wang
{"title":"Decorrelative network architecture for robust electrocardiogram classification.","authors":"Christopher Wiedeman, Ge Wang","doi":"10.1016/j.patter.2024.101116","DOIUrl":"10.1016/j.patter.2024.101116","url":null,"abstract":"<p><p>To achieve adequate trust in patient-critical medical tasks, artificial intelligence must be able to recognize instances where it cannot operate confidently. Ensemble methods are deployed to estimate uncertainty, but models in an ensemble often share the same vulnerabilities to adversarial attacks. We propose an ensemble approach based on feature decorrelation and Fourier partitioning for teaching networks diverse features, reducing the chance of perturbation-based fooling. We test our approach against white-box attacks in single- and multi-channel electrocardiogram classification and adapt adversarial training and DVERGE into an ensemble framework for comparison. Our results indicate that the combination of decorrelation and Fourier partitioning maintains performance on unperturbed data while demonstrating superior uncertainty estimation on projected gradient descent and smooth adversarial attacks of various magnitudes. Furthermore, our approach does not require expensive optimization with adversarial samples during training. These methods can be applied to other tasks for more robust models.</p>","PeriodicalId":36242,"journal":{"name":"Patterns","volume":"5 12","pages":"101116"},"PeriodicalIF":6.7,"publicationDate":"2024-12-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11701855/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142956216","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
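The abstract above names feature decorrelation as a key ingredient but gives no formula. One common way to express such a penalty (an assumption on our part, not necessarily the authors' exact loss) is the mean squared cross-correlation between two ensemble members' standardized feature batches:

```python
import numpy as np

def decorrelation_penalty(feat_a, feat_b):
    """Mean squared cross-correlation between two feature batches.

    feat_a, feat_b: (batch, dim) arrays of intermediate-layer features
    from two ensemble members on the same inputs. Adding this term to
    the training loss pushes the members toward uncorrelated features.
    """
    za = (feat_a - feat_a.mean(axis=0)) / (feat_a.std(axis=0) + 1e-8)
    zb = (feat_b - feat_b.mean(axis=0)) / (feat_b.std(axis=0) + 1e-8)
    corr = za.T @ zb / feat_a.shape[0]  # (dim, dim) cross-correlation matrix
    return float(np.mean(corr ** 2))

rng = np.random.default_rng(0)
x = rng.normal(size=(256, 8))
identical = decorrelation_penalty(x, x)  # maximal correlation
independent = decorrelation_penalty(x, rng.normal(size=(256, 8)))
assert identical > independent  # the penalty rewards diverse features
```

Driving this penalty toward zero is what makes it harder for a single adversarial perturbation to fool every member of the ensemble at once.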
Patterns | Pub Date: 2024-12-06 | eCollection Date: 2024-12-13 | DOI: 10.1016/j.patter.2024.101115
Jake Crawford, Maria Chikina, Casey S Greene
{"title":"Best holdout assessment is sufficient for cancer transcriptomic model selection.","authors":"Jake Crawford, Maria Chikina, Casey S Greene","doi":"10.1016/j.patter.2024.101115","DOIUrl":"https://doi.org/10.1016/j.patter.2024.101115","url":null,"abstract":"<p><p>Guidelines in statistical modeling for genomics hold that simpler models have advantages over more complex ones. Potential advantages include cost, interpretability, and improved generalization across datasets or biological contexts. We directly tested the assumption that small gene signatures generalize better by examining the generalization of mutation status prediction models across datasets (from cell lines to human tumors and vice versa) and biological contexts (holding out entire cancer types from pan-cancer data). We compared model selection based solely on cross-validation performance against selection that combines cross-validation performance with regularization strength. We did not observe that more regularized signatures generalized better. This result held across both generalization problems and for both linear models (LASSO logistic regression) and non-linear ones (neural networks). 
When the goal of an analysis is to produce generalizable predictive models, we recommend choosing the ones that perform best on held-out data or in cross-validation instead of those that are smaller or more regularized.</p>","PeriodicalId":36242,"journal":{"name":"Patterns","volume":"5 12","pages":"101115"},"PeriodicalIF":6.7,"publicationDate":"2024-12-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11701843/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142956206","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
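The recommendation above (pick the best held-out performer rather than the smallest or most regularized adequate model) contrasts with selection heuristics such as the classic "one-standard-error" rule. A minimal sketch of the two policies, using hypothetical cross-validation scores rather than the paper's data:

```python
def pick_best(cv_means):
    """Select the hyperparameter with the best mean CV score
    (the policy the paper recommends)."""
    return max(cv_means, key=cv_means.get)

def pick_one_se(cv_means, cv_ses, reg_order):
    """'One-standard-error' rule: the most regularized model whose mean
    CV score is within one standard error of the best performer."""
    best = pick_best(cv_means)
    threshold = cv_means[best] - cv_ses[best]
    for h in reg_order:  # reg_order runs from most to least regularized
        if cv_means[h] >= threshold:
            return h
    return best

# Hypothetical CV results; smaller C = stronger LASSO regularization.
means = {"C=0.01": 0.78, "C=0.1": 0.83, "C=1.0": 0.84}
ses = {"C=0.01": 0.02, "C=0.1": 0.02, "C=1.0": 0.02}
order = ["C=0.01", "C=0.1", "C=1.0"]
best = pick_best(means)                   # "C=1.0"
smaller = pick_one_se(means, ses, order)  # "C=0.1"
```

The paper's finding is that the extra preference for regularization built into rules like `pick_one_se` did not buy better generalization, so `pick_best` suffices.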