Briefings in bioinformatics: latest articles

scSAMAC: saliency-adjusted masking induced attention contrastive learning for single-cell clustering.
IF 6.8, Zone 2 (Biology)
Briefings in bioinformatics Pub Date : 2025-03-04 DOI: 10.1093/bib/bbaf128
Volume 26, Issue 2
Bo Li, Yongkang Zhao, Jing Hu, Shihua Zhang, Xiaolong Zhang
Abstract: Single-cell sequencing technology has enabled researchers to study cellular heterogeneity at the cell level. To facilitate the downstream analysis, clustering single-cell data into subgroups is essential. However, the high dimensionality, sparsity, and dropout events of the data make the clustering challenging. Currently, many deep learning methods have been proposed. Nevertheless, they either fail to fully utilize pairwise distances information between similar cells, or do not adequately capture their feature correlations. They cannot also effectively handle high-dimensional sparse data. Therefore, they are not suitable for high-fidelity clustering, leading to difficulties in analyzing the clear cell types required for downstream analysis. The proposed scSAMAC method integrates contrastive learning and negative binomial losses into a variational autoencoder, extracting features via contrastive unit similarity while preserving the intrinsic characteristics. This enhances the robustness and generalization during the clustering. In the contrastive learning, it constructs a mask module by adopting a negative sample generation method with gene feature saliency adjustment, which selects features more influential in the clustering phase and simulates data missing events. Additionally, it develops a novel loss, which consists of a soft k-means loss, a Wasserstein distance, and a contrastive loss. This fully utilizes data information and improves clustering performance. Furthermore, a multi-head attention mechanism module is applied to the latent variables at each layer of autoencoder to enhance feature correlation, integration, and information repair. Experimental results demonstrate that scSAMAC outperforms several state-of-the-art clustering methods.
PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11934584/pdf/
Citations: 0
SC-VAR: a computational tool for interpreting polygenic disease risks using single-cell epigenomic data.
IF 6.8, Zone 2 (Biology)
Briefings in bioinformatics Pub Date : 2025-03-04 DOI: 10.1093/bib/bbaf123
Volume 26, Issue 2
Gefei Zhao, Binbin Lai
Abstract:
Motivation: One major challenge of interpreting variants from genome-wide association studies (GWAS) of complex traits or diseases is how to efficiently annotate noncoding variants. These variants influence gene expression by disrupting cis-regulatory elements (CREs), whose spatial and cell-type specificity are not adequately captured by conventional tools like multi-marker analysis of genomic annotation. Current methods either rely on linear proximity to genes or quantitative trait locus (QTL) data yet fail to integrate single-cell epigenomic information for a comprehensive annotation.
Results: We present SC-VAR, a novel computational tool designed to enhance the interpretation of disease-associated risks from GWAS using single-cell epigenomic data. SC-VAR leverages single-cell epigenomic data to predict functional outcomes including risk genes, pathways, and cell types for both coding and noncoding disease-associated variants. We demonstrate that SC-VAR outperforms state-of-the-art methods by predicting more validated disease-related genes and pathways for multiple diseases. Additionally, SC-VAR identifies cell types that are susceptible to disease, along with their specific CREs and target genes linked to risk. By capturing a broad range of disease risks across human tissues at distinct developmental stages, SC-VAR could enhance our understanding of disease mechanisms in complex tissues across different life stages.
PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11932087/pdf/
Citations: 0
A novel integrative multimodal classifier to enhance the diagnosis of Parkinson's disease.
IF 6.8, Zone 2 (Biology)
Briefings in bioinformatics Pub Date : 2025-03-04 DOI: 10.1093/bib/bbaf088
Volume 26, Issue 2
Xiaoyan Zhou, Luca Parisi, Wentao Huang, Yihan Zhang, Xiaoqun Huang, Mansour Youseffi, Farideh Javid, Renfei Ma
Abstract: Parkinson's disease (PD) is a complex, progressive neurodegenerative disorder with high heterogeneity, making early diagnosis difficult. Early detection and intervention are crucial for slowing PD progression. Understanding PD's diverse pathways and mechanisms is key to advancing knowledge. Recent advances in noninvasive imaging and multi-omics technologies have provided valuable insights into PD's underlying causes and biological processes. However, integrating these diverse data sources remains challenging, especially when deriving meaningful low-level features that can serve as diagnostic indicators. This study developed and validated a novel integrative, multimodal predictive model for detecting PD based on features derived from multimodal data, including hematological information, proteomics, RNA sequencing, metabolomics, and dopamine transporter scan imaging, sourced from the Parkinson's Progression Markers Initiative. Several model architectures were investigated and evaluated, including support vector machine, eXtreme Gradient Boosting, fully connected neural networks with concatenation and joint modeling (FCNN_C and FCNN_JM), and a multimodal encoder-based model with multi-head cross-attention (MMT_CA). The MMT_CA model demonstrated superior predictive performance, achieving a balanced classification accuracy of 97.7%, thus highlighting its ability to capture and leverage cross-modality inter-dependencies to aid predictive analytics. Furthermore, feature importance analysis using SHapley Additive exPlanations not only identified crucial diagnostic biomarkers to inform the predictive models in this study but also holds potential for future research aimed at integrated functional analyses of PD from a multi-omics perspective, ultimately revealing targets required for precision medicine approaches to aid treatment of PD aimed at slowing down its progression.
PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11891661/pdf/
Citations: 0
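The Parkinson's disease abstract above reports a balanced classification accuracy. As a reminder of what that metric measures, the sketch below computes it as the mean of per-class recall, which is the usual definition; the abstract does not state the exact formula the authors used, and the labels here are hypothetical toy data, not results from the study.

```python
from collections import Counter

def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall, so every class counts equally regardless of size."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equally long")
    totals = Counter(y_true)                                     # samples per true class
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)  # correct predictions per class
    return sum(hits[c] / totals[c] for c in totals) / len(totals)

# Hypothetical labels: 8 PD cases and 2 healthy controls (HC).
y_true = ["PD"] * 8 + ["HC"] * 2
y_pred = ["PD"] * 8 + ["PD", "HC"]   # one control is misclassified as PD

plain = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(plain, balanced_accuracy(y_true, y_pred))  # prints: 0.9 0.75
```

On imbalanced cohorts the plain accuracy (0.9 here) can look better than the balanced figure (0.75), which is why the balanced variant is often preferred for case/control classification.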
Benchmarking copy number aberrations inference tools using single-cell multi-omics datasets.
IF 6.8, Zone 2 (Biology)
Briefings in bioinformatics Pub Date : 2025-03-04 DOI: 10.1093/bib/bbaf076
Volume 26, Issue 2
Minfang Song, Shuai Ma, Gong Wang, Yukun Wang, Zhenzhen Yang, Bin Xie, Tongkun Guo, Xingxu Huang, Liye Zhang
Abstract: Copy number alterations (CNAs) are an important type of genomic variation which play a crucial role in the initiation and progression of cancer. With the explosion of single-cell RNA sequencing (scRNA-seq), several computational methods have been developed to infer CNAs from scRNA-seq studies. However, to date, no independent studies have comprehensively benchmarked their performance. Herein, we evaluated five state-of-the-art methods based on their performance in tumor versus normal cell classification; CNAs profile accuracy, tumor subclone inference, and aneuploidy identification in non-malignant cells. Our results showed that Numbat outperformed others across most evaluation criteria, while CopyKAT excelled in scenarios when expression matrix alone was used as input. In specific tasks, SCEVAN showed the best performance in clonal breakpoint detection and Numbat showed high sensitivity in copy number neutral LOH (cnLOH) detection. Additionally, we investigated how referencing settings, inclusion of tumor microenvironment cells, tumor type, and tumor purity impact the performance of these tools. This study provides a valuable guideline for researchers in selecting the appropriate methods for their datasets.
PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11879432/pdf/
Citations: 0
STForte: tissue context-specific encoding and consistency-aware spatial imputation for spatially resolved transcriptomics.
IF 6.8, Zone 2 (Biology)
Briefings in bioinformatics Pub Date : 2025-03-04 DOI: 10.1093/bib/bbaf174
Volume 26, Issue 2
Yuxuan Pang, Chunxuan Wang, Yao-Zhong Zhang, Zhuo Wang, Seiya Imoto, Tzong-Yi Lee
Abstract: Encoding spatially resolved transcriptomics (SRT) data serves to identify the biological semantics of RNA expression within the tissue while preserving spatial characteristics. Depending on the analytical scenario, one may focus on different contextual structures of tissues. For instance, anatomical regions reveal consistent patterns by focusing on spatial homogeneity, while elucidating complex tumor micro-environments requires more expression heterogeneity. However, current spatial encoding methods lack consideration of the tissue context. Meanwhile, most developed SRT technologies are still limited in providing exact patterns of intact tissues due to limitations such as low resolution or missed measurements. Here, we propose STForte, a novel pairwise graph autoencoder-based approach with cross-reconstruction and adversarial distribution matching, to model the spatial homogeneity and expression heterogeneity of SRT data. STForte extracts interpretable latent encodings, enabling downstream analysis by accurately portraying various tissue contexts. Moreover, STForte allows spatial imputation using only spatial consistency to restore the biological patterns of unobserved locations or low-quality cells, thereby providing fine-grained views to enhance the SRT analysis. Extensive evaluations of datasets under different scenarios and SRT platforms demonstrate that STForte is a scalable and versatile tool for providing enhanced insights into spatial data analysis.
PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12009714/pdf/
Citations: 0
DMGAT: predicting ncRNA-drug resistance associations based on diffusion map and heterogeneous graph attention network.
IF 6.8, Zone 2 (Biology)
Briefings in bioinformatics Pub Date : 2025-03-04 DOI: 10.1093/bib/bbaf179
Volume 26, Issue 2
Tingyu Liu, Qiuhao Chen, Renjie Liu, Yuzhi Sun, Yadong Wang, Yan Zhu, Tianyi Zhao
Abstract: Non-coding RNAs (ncRNAs) play crucial roles in drug resistance and sensitivity, making them important biomarkers and therapeutic targets. However, predicting ncRNA-drug associations is challenging due to issues such as dataset imbalance and sparsity, limiting the identification of robust biomarkers. Existing models often fall short in capturing local and global sequence information, limiting the reliability of predictions. This study introduces DMGAT (diffusion map and heterogeneous graph attention network), a novel deep learning model designed to predict ncRNA-drug associations. DMGAT integrates diffusion maps for sequence embedding, graph convolutional networks for feature extraction, and GAT for heterogeneous information fusion. To address dataset imbalance, the model incorporates sensitivity associations and employs a random forest classifier to select reliable negative samples. DMGAT embeds ncRNA sequences and drug SMILES using the word2vec technique, capturing local and global sequence information. The model constructs a heterogeneous network by combining sequence similarity and Gaussian Interaction Profile kernel similarity, providing a comprehensive representation of ncRNA-drug interactions. Evaluated through five-fold cross-validation on a curated dataset from NoncoRNA and ncDR, DMGAT outperforms seven state-of-the-art methods, achieving the highest area under the receiver operating characteristic curve (0.8964), area under the precision-recall curve (0.8984), recall (0.9576), and F1-score (0.8285). The raw data are released to Zenodo with identifier 13929676. The source code of DMGAT is available at https://github.com/liutingyu0616/DMGAT/tree/main.
PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12008124/pdf/
Citations: 0
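The DMGAT abstract above mentions Gaussian Interaction Profile (GIP) kernel similarity without giving a formula. The sketch below follows the commonly used definition, K(i, j) = exp(-gamma * ||a_i - a_j||^2) with gamma = gamma' / mean_i ||a_i||^2, where a_i is row i of a binary association matrix. It is not taken from the DMGAT source code; the function name `gip_kernel`, the default `gamma_prime=1.0`, and the toy matrix are all assumptions made for illustration.

```python
import math

def gip_kernel(assoc, gamma_prime=1.0):
    """GIP kernel between the rows (interaction profiles) of a 0/1 association matrix."""
    if not assoc:
        raise ValueError("association matrix is empty")
    sq_norms = [sum(v * v for v in row) for row in assoc]
    mean_sq = sum(sq_norms) / len(assoc)
    if mean_sq == 0:
        raise ValueError("association matrix has no known associations")
    gamma = gamma_prime / mean_sq  # bandwidth normalised by the mean squared profile norm

    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    return [[math.exp(-gamma * sq_dist(a, b)) for b in assoc] for a in assoc]

# Hypothetical toy matrix: 3 ncRNAs (rows) x 3 drugs (columns).
assoc = [[1, 0, 1],
         [1, 0, 1],
         [0, 1, 0]]
K = gip_kernel(assoc)
# Rows 0 and 1 share the same profile, so K[0][1] is 1.0;
# row 2 shares no drug with row 0, so K[0][2] is much smaller.
```

Passing the transposed matrix yields the corresponding drug-by-drug similarity, which is how this kernel is typically applied to both node types of a bipartite association network.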
Nonlinear embedding and integration of omics data: a fast and tuning-free approach.
IF 6.8, Zone 2 (Biology)
Briefings in bioinformatics Pub Date : 2025-03-04 DOI: 10.1093/bib/bbaf184
Volume 26, Issue 2
Shengjie Liu, Tianwei Yu
Abstract: The rapid progress of single-cell technology has facilitated cost-effective acquisition of diverse omics data, allowing biologists to unravel the complexities of cell populations, disease states, and more. Additionally, single-cell multi-omics technologies have opened new avenues for studying biological interactions. However, the high dimensionality and sparsity of omics data present significant analytical challenges. Dimension reduction (DR) techniques are hence essential for analyzing such complex data, yet many existing methods have inherent limitations. Linear methods like principal component analysis (PCA) struggle to capture intricate associations within data. In response, nonlinear techniques have emerged, but they may face scalability issues, be restricted to single-omics data, or prioritize visualization over generating informative embeddings. Here, we introduce dissimilarity based on conditional ordered list (DCOL) correlation, a novel measure for quantifying nonlinear relationships between variables. Based on this measure, we propose DCOL-PCA and DCOL-Canonical Correlation Analysis for dimension reduction and integration of single- and multi-omics data. In simulations, our methods outperformed nine DR methods and four joint dimension reduction methods, demonstrating stable performance across various settings. We also validated these methods on real datasets, with our method demonstrating its ability to detect intricate signals within and between omics data and generate lower dimensional embeddings that preserve the essential information and latent structures.
PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12009717/pdf/
Citations: 0
HPV-KITE: sequence analysis software for rapid HPV genotype detection.
IF 6.8, Zone 2 (Biology)
Briefings in bioinformatics Pub Date : 2025-03-04 DOI: 10.1093/bib/bbaf155
Volume 26, Issue 2
Marek Nowicki, Magdalena Mroczek, Dhananjay Mukhedkar, Piotr Bała, Ville Nikolai Pimenoff, Laila Sara Arroyo Mühr
Abstract: Human papillomaviruses (HPVs) are among the most diverse viral families that infect humans. Fortunately, only a small number of closely related HPV types affect human health, most notably by causing nearly all cervical cancers, as well as some oral and other anogenital cancers, particularly when infections with high-risk HPV types become persistent. Numerous viral polymerase chain reaction-based diagnostic methods as well as sequencing protocols have been developed for accurate, rapid, and efficient HPV genotyping. However, due to the large number of closely related HPV genotypes and the abundance of nonviral DNA in human derived biological samples, it can be challenging to correctly detect HPV genotypes using high throughput deep sequencing. Here, we introduce a novel HPV detection algorithm, HPV-KITE (HPV K-mer Index Tversky Estimator), which leverages k-mer data analysis and utilizes Tversky indexing for DNA and RNA sequence data. This method offers a rapid and sensitive alternative for detecting HPV from both metagenomic and transcriptomic datasets. We assessed HPV-KITE using three previously analyzed HPV infection-related datasets, comprising a total of 1430 sequenced human samples. For benchmarking, we compared our method's performance with standard HPV sequencing analysis algorithms, including general sequence-based mapping, and k-mer-based classification methods. Parallelization demonstrated fast processing times achieved through shingling, and scalability analysis revealed optimal performance when employing multiple nodes. Our results showed that HPV-KITE is one of the fastest, most accurate, and easiest ways to detect HPV genotypes from virtually any next-generation sequencing data. Moreover, the method is also highly scalable and available to be optimized for any microorganism other than HPV.
PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11982018/pdf/
Citations: 0
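The HPV-KITE abstract above describes k-mer analysis combined with a Tversky index. To make that idea concrete, the sketch below shingles two sequences into k-mer sets and scores them with the textbook Tversky index, S(X, Y) = |X ∩ Y| / (|X ∩ Y| + alpha * |X - Y| + beta * |Y - X|). This is not the HPV-KITE implementation: the helper names, the choice k = 4, the weights, and the toy sequences are assumptions chosen only to illustrate the measure.

```python
def kmers(seq, k):
    """Shingle a sequence into its set of overlapping k-mers."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}

def tversky(x, y, alpha=1.0, beta=1.0):
    """Tversky index of two sets; alpha = beta = 1 reduces to the Jaccard index."""
    shared = len(x & y)
    denom = shared + alpha * len(x - y) + beta * len(y - x)
    return shared / denom if denom else 0.0

# Hypothetical toy sequences: a short read contained in a longer reference.
read = kmers("ACGTACGT", 4)
ref = kmers("TTACGTACGTTT", 4)

containment = tversky(read, ref, alpha=1.0, beta=0.0)  # ignores k-mers found only in the reference
jaccard = tversky(read, ref)                           # penalises them equally
print(containment, jaccard)
```

The asymmetric weighting is what makes the index attractive for read-versus-reference comparison: with beta = 0 a short read that is fully contained in a much longer reference still scores 1.0, whereas the symmetric Jaccard variant scores it 4/7 here.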
Assessing the impact of batch effect associated missing values on downstream analysis in high-throughput biomedical data.
IF 6.8, Zone 2 (Biology)
Briefings in bioinformatics Pub Date : 2025-03-04 DOI: 10.1093/bib/bbaf168
Volume 26, Issue 2
Harvard Wai Hann Hui, Wei Xin Chan, Wilson Wen Bin Goh
Abstract: Batch effect associated missing values (BEAMs) are batch-wide missingness induced from the integration of data with different coverage of biomedical features. BEAMs can present substantial challenges in data analysis. This study investigates how BEAMs impact missing value imputation (MVI) and batch effect (BE) correction algorithms (BECAs). Through simulations and analyses of real-world datasets including the Clinical Proteomic Tumour Analysis Consortium (CPTAC), we evaluated six MVI methods: K-nearest neighbors (KNN), Mean, MinProb, Singular Value Decomposition (SVD), Multivariate Imputation by Chained Equations (MICE), and Random Forest (RF), with ComBat and limma as the BECAs. We demonstrated that BEAMs strongly affect MVI performance, resulting in inaccurate imputed values, inflated significant P-values, and compromised BE correction. KNN, SVD, and RF were particularly prone to propagating random signals, resulting in false statistical confidence. While imputation with Mean and MinProb were less detrimental, artifacts were nonetheless introduced. Furthermore, the detrimental effect of BEAMs increased in parallel with its severity in the data. Our findings highlight the necessity of comprehensive assessments and tailored strategies to handle BEAMs in multi-batch datasets to ensure reliable data analysis and interpretation. Future work should investigate more advanced simulations and a variety of dedicated MVI methods to robustly address BEAMs.
PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12066825/pdf/
Citations: 0
Consensus statement on the credibility assessment of machine learning predictors.
IF 6.8, Zone 2 (Biology)
Briefings in bioinformatics Pub Date : 2025-03-04 DOI: 10.1093/bib/bbaf100
Volume 26, Issue 2
Alessandra Aldieri, Thiranja Prasad Babarenda Gamage, Antonino Amedeo La Mattina, Axel Loewe, Francesco Pappalardo, Marco Viceconti
Abstract: The rapid integration of machine learning (ML) predictors into in silico medicine has revolutionized the estimation of quantities of interest that are otherwise challenging to measure directly. However, the credibility of these predictors is critical, especially when they inform high-stakes healthcare decisions. This position paper presents a consensus statement developed by experts within the In Silico World Community of Practice. We outline 12 key statements forming the theoretical foundation for evaluating the credibility of ML predictors, emphasizing the necessity of causal knowledge, rigorous error quantification, and robustness to biases. By comparing ML predictors with biophysical models, we highlight unique challenges associated with implicit causal knowledge and propose strategies to ensure reliability and applicability. Our recommendations aim to guide researchers, developers, and regulators in the rigorous assessment and deployment of ML predictors in clinical and biomedical contexts.
PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11891646/pdf/
Citations: 0