Biodata Mining: Latest Articles

Decoding the genetic comorbidity network of Alzheimer's disease.
IF 4, Zone 3 (Biology)
Biodata Mining Pub Date: 2024-10-09 DOI: 10.1186/s13040-024-00394-w
Xueli Zhang, Dantong Li, Siting Ye, Shunming Liu, Shuo Ma, Min Li, Qiliang Peng, Lianting Hu, Xianwen Shang, Mingguang He, Lei Zhang
Abstract: Alzheimer's disease (AD) has emerged as the most prevalent and complex neurodegenerative disorder among the elderly population. However, the genetic comorbidity etiology for AD remains poorly understood. In this study, we conducted pleiotropic analysis for 41 AD phenotypic comorbidities, identifying ten genetic comorbidities with 16 pleiotropic genes associated with AD. Through biological functional and network analysis, we elucidated the molecular and functional landscape of AD genetic comorbidities. Furthermore, leveraging the pleiotropic genes and reported biomarkers for AD genetic comorbidities, we identified 50 potential biomarkers for AD diagnosis. Our findings deepen the understanding of the occurrence of AD genetic comorbidities and provide new insights for the search for AD diagnostic markers.
Volume 17, issue 1, article 40. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11465508/pdf/
Citations: 0
MDVarP: modifier ~ disease-causing variant pairs predictor.
IF 4, Zone 3 (Biology)
Biodata Mining Pub Date: 2024-10-08 DOI: 10.1186/s13040-024-00392-y
Hong Sun, Yunqin Chen, Liangxiao Ma
Abstract:
Background: Modifiers significantly impact disease phenotypes by modulating the effects of disease-causing variants, resulting in varying disease manifestations among individuals. However, identifying genetic interactions between modifier and disease-causing variants is challenging.
Results: We developed MDVarP, an ensemble model comprising 1000 random forest predictors, to identify modifier ~ disease-causing variant combinations. MDVarP achieves high accuracy and precision, as verified using an independent dataset with published evidence of genetic interactions. We identified 25 novel modifier ~ disease-causing variant combinations and obtained supporting evidence for these associations. MDVarP outputs a class label ("Associated-pair" or "Nonrelevant-pair") and two prediction scores indicating the probability of a true association.
Conclusions: MDVarP prioritizes variant pairs associated with phenotypic modulations, enabling more effective mapping of functional contributions from disease-causing and modifier variants. This framework interprets genetic interactions underlying phenotypic variations in human diseases, with potential applications in personalized medicine and disease prevention.
Volume 17, issue 1, article 39. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11460193/pdf/
Citations: 0
Deep learning-based approaches for multi-omics data integration and analysis.
IF 4, Zone 3 (Biology)
Biodata Mining Pub Date: 2024-10-02 DOI: 10.1186/s13040-024-00391-z
Jenna L Ballard, Zexuan Wang, Wenrui Li, Li Shen, Qi Long
Abstract:
Background: The rapid growth of deep learning, as well as the vast and ever-growing amount of available data, has provided ample opportunity for advances in fusion and analysis of complex and heterogeneous data types. Different data modalities provide complementary information that can be leveraged to gain a more complete understanding of each subject. In the biomedical domain, multi-omics data includes molecular (genomics, transcriptomics, proteomics, epigenomics, metabolomics, etc.) and imaging (radiomics, pathomics) modalities which, when combined, have the potential to improve performance on prediction, classification, clustering and other tasks. Deep learning encompasses a wide variety of methods, each of which has certain strengths and weaknesses for multi-omics integration.
Method: In this review, we categorize recent deep learning-based approaches by their basic architectures and discuss their unique capabilities in relation to one another. We also discuss some emerging themes advancing the field of multi-omics integration.
Results: Deep learning-based multi-omics integration methods were categorized broadly into non-generative (feedforward neural networks, graph convolutional neural networks, and autoencoders) and generative (variational methods, generative adversarial models, and a generative pretrained model). Generative methods have the advantage of being able to impose constraints on the shared representations to enforce certain properties or incorporate prior knowledge. They can also be used to generate or impute missing modalities. Recent advances achieved by these methods include the ability to handle incomplete data as well as going beyond the traditional molecular omics data types to integrate other modalities such as imaging data.
Conclusion: We expect to see further growth in methods that can handle missingness, as this is a common challenge in working with complex and heterogeneous data. Additionally, methods that integrate more data types are expected to improve performance on downstream tasks by capturing a comprehensive view of each sample.
Volume 17, issue 1, article 38. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11446004/pdf/
Citations: 0
Assessing the limitations of relief-based algorithms in detecting higher-order interactions.
IF 4, Zone 3 (Biology)
Biodata Mining Pub Date: 2024-10-01 DOI: 10.1186/s13040-024-00390-0
Philip J Freda, Suyu Ye, Robert Zhang, Jason H Moore, Ryan J Urbanowicz
Abstract:
Background: Epistasis, the interaction between genetic loci where the effect of one locus is influenced by one or more other loci, plays a crucial role in the genetic architecture of complex traits. However, as the number of loci considered increases, the investigation of epistasis becomes exponentially more complex, making the selection of key features vital for effective downstream analyses. Relief-Based Algorithms (RBAs) are often employed for this purpose due to their reputation as "interaction-sensitive" algorithms and uniquely non-exhaustive approach. However, the limitations of RBAs in detecting interactions, particularly those involving multiple loci, have not been thoroughly defined. This study seeks to address this gap by evaluating the efficiency of RBAs in detecting higher-order epistatic interactions. Motivated by previous findings that suggest some RBAs may rank predictive features involved in higher-order epistasis negatively, we explore the potential of absolute value ranking of RBA feature weights as an alternative approach for capturing complex interactions. In this study, we assess the performance of ReliefF, MultiSURF, and MultiSURFstar on simulated genetic datasets that model various patterns of genotype-phenotype associations, including 2-way to 5-way genetic interactions, and compare their performance to two control methods: a random shuffle and mutual information.
Results: Our findings indicate that while RBAs effectively identify lower-order (2 to 3-way) interactions, their capability to detect higher-order interactions is significantly limited, primarily by large feature count but also by signal noise. Specifically, we observe that RBAs are successful in detecting fully penetrant 4-way XOR interactions using an absolute value ranking approach, but this is restricted to datasets with only 20 total features.
Conclusions: These results highlight the inherent limitations of current RBAs and underscore the need for the development of Relief-based approaches with enhanced detection capabilities for the investigation of epistasis, particularly in datasets with large feature counts and complex higher-order interactions.
Volume 17, issue 1, article 37. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11443793/pdf/
Citations: 0
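
The negative-weight behaviour mentioned in the Relief abstract above can be seen on a toy example. The sketch below is a minimal, illustrative basic Relief, not ReliefF, MultiSURF or MultiSURFstar, and not the authors' simulation setup. It assumes binary features, Hamming distance, and averaging over all tied nearest hits and misses. The dataset (the full truth table of a 3-way XOR plus one constant, uninformative feature) and the names `relief_weights` and `rank_features` are hypothetical.

```python
from itertools import product


def relief_weights(X, y):
    """Basic Relief for binary features and two classes (toy version).

    Assumes every instance has at least one other instance of each class.
    """
    n, d = len(X), len(X[0])
    weights = [0.0] * d
    for i in range(n):
        hits, misses = [], []
        for j in range(n):
            if j == i:
                continue
            # Hamming distance between instance i and instance j.
            dist = sum(a != b for a, b in zip(X[i], X[j]))
            (hits if y[j] == y[i] else misses).append((dist, j))
        # Nearest hits lower a weight, nearest misses raise it.
        for group, sign in ((hits, -1.0), (misses, 1.0)):
            nearest = min(dist for dist, _ in group)
            tied = [j for dist, j in group if dist == nearest]
            for f in range(d):
                diff = sum(X[i][f] != X[j][f] for j in tied) / len(tied)
                weights[f] += sign * diff / n
    return weights


def rank_features(weights, absolute=False):
    """Return feature indices, best first, by raw or absolute weight."""
    key = (lambda f: abs(weights[f])) if absolute else (lambda f: weights[f])
    return sorted(range(len(weights)), key=key, reverse=True)


# Hypothetical data: features 0-2 define a 3-way XOR, feature 3 is constant.
X = [bits + (0,) for bits in product((0, 1), repeat=3)]
y = [a ^ b ^ c for a, b, c, _ in X]

weights = relief_weights(X, y)
raw_ranking = rank_features(weights)
abs_ranking = rank_features(weights, absolute=True)
```

On this toy truth table the three XOR features receive equal negative weights and the constant feature scores zero, so raw ranking places the uninformative feature first while absolute-value ranking places it last.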
Identifying heterogeneous subgroups of systemic autoimmune diseases by applying a joint dimension reduction and clustering approach to immunomarkers
IF 4.5, Zone 3 (Biology)
Biodata Mining Pub Date: 2024-09-16 DOI: 10.1186/s13040-024-00389-7
Chia-Wei Chang, Hsin-Yao Wang, Wan-Ying Lin, Yu-Chiang Wang, Wei-Lin Lo, Ting-Wei Lin, Jia-Ruei Yu, Yi-Ju Tseng
Abstract: The high complexity of systemic autoimmune diseases (SADs) has hindered precise management. This study aims to investigate heterogeneity in SADs. We applied a joint cluster analysis, which combined multiple correspondence analysis and k-means, to immunomarkers and measured the heterogeneity of clusters by examining differences in immunomarkers and clinical manifestations. The electronic health records of patients who received an antinuclear antibody test and were diagnosed with SADs, namely systemic lupus erythematosus (SLE), rheumatoid arthritis (RA), and Sjögren’s syndrome (SS), were retrieved between 2001 and 2016 from hospitals in Taiwan. With distinctive patterns of immunomarkers, a total of 11,923 patients with the three SADs were grouped into six clusters. None of the clusters was composed only of a single SAD, and these clusters demonstrated considerable differences in clinical manifestation. Both patients with SLE and SS had a more dispersed distribution in the six clusters. Among patients with SLE, the occurrence of renal compromise was higher in Clusters 3 and 6 (52% and 51%) than in the other clusters (p < 0.001). Cluster 3 also had a higher proportion of patients with discoid lupus (60%) than did Cluster 6 (39%; p < 0.001). Patients with SS in Cluster 3 were the most distinctive because of the high occurrence of immunity disorders (63%) and other and unspecified benign neoplasm (58%) with statistical significance compared with the other clusters (all p < 0.05). The immunomarker-driven clustering method could recognise more clinically relevant subgroups of the SADs and would provide a more precise diagnosis basis.
Citations: 0
Development, evaluation and comparison of machine learning algorithms for predicting in-hospital patient charges for congestive heart failure exacerbations, chronic obstructive pulmonary disease exacerbations and diabetic ketoacidosis
IF 4.5, Zone 3 (Biology)
Biodata Mining Pub Date: 2024-09-12 DOI: 10.1186/s13040-024-00387-9
Monique Arnold, Lathan Liou, Mary Regina Boland
Abstract: Hospitalizations for exacerbations of congestive heart failure (CHF), chronic obstructive pulmonary disease (COPD) and diabetic ketoacidosis (DKA) are costly in the United States. The purpose of this study was to predict in-hospital charges for each condition using machine learning (ML) models. We conducted a retrospective cohort study on national discharge records of hospitalized adult patients from January 1st, 2016, to December 31st, 2019. We constructed six ML models (linear regression, ridge regression, support vector machine, random forest, gradient boosting and extreme gradient boosting) to predict total in-hospital cost for admission for each condition. Our models had good predictive performance, with testing R-squared values of 0.701-0.750 (mean 0.713) for CHF; 0.694-0.724 (mean 0.709) for COPD; and 0.615-0.729 (mean 0.694) for DKA. We identified key features driving costs, including patient age, length of stay, number of procedures, and elective/nonelective admission. ML methods may be used to accurately predict costs and identify drivers of high cost for COPD exacerbations, CHF exacerbations and DKA. Overall, our findings may inform future studies that seek to decrease the underlying high patient costs for these conditions.
Citations: 0
Private pathological assessment via machine learning and homomorphic encryption
IF 4.5, Zone 3 (Biology)
Biodata Mining Pub Date: 2024-09-10 DOI: 10.1186/s13040-024-00379-9
Ahmad Al Badawi, Mohd Faizal Bin Yusof
Abstract: The objective of this research is to explore the applicability of machine learning and fully homomorphic encryption (FHE) in private pathological assessment, with a focus on the inference phase of support vector machines (SVM) for the classification of confidential medical data. A framework is introduced that utilizes the Cheon-Kim-Kim-Song (CKKS) FHE scheme, facilitating the execution of SVM inference on encrypted datasets. This framework ensures the privacy of patient data and negates the necessity of decryption during the analytical process. Additionally, an efficient feature extraction technique is presented for the transformation of medical imagery into vectorial representations. The system’s evaluation across various datasets substantiates its practicality and efficacy. The proposed method delivers classification accuracy and performance on par with traditional, non-encrypted SVM inference, while upholding a 128-bit security level against established cryptographic attacks targeting the CKKS scheme. The secure inference process is executed within seconds. The findings of this study underscore the viability of FHE in enhancing the security and efficiency of bioinformatics analyses, potentially benefiting fields such as cardiology, oncology, and medical imagery. The implications of this research are significant for the future of privacy-preserving machine learning, promoting progress in diagnostic procedures, tailored medical treatments, and clinical investigations.
Citations: 0
Knowledge-slanted random forest method for high-dimensional data and small sample size with a feature selection application for gene expression data
IF 4.5, Zone 3 (Biology)
Biodata Mining Pub Date: 2024-09-10 DOI: 10.1186/s13040-024-00388-8
Erika Cantor, Sandra Guauque-Olarte, Roberto León, Steren Chabert, Rodrigo Salas
Abstract: The use of prior knowledge in the machine learning framework has been considered a potential tool to handle the curse of dimensionality in genetic and genomics data. Although random forest (RF) represents a flexible non-parametric approach with several advantages, it can provide poor accuracy in high-dimensional settings, mainly in scenarios with small sample sizes. We propose a knowledge-slanted RF that integrates biological networks as prior knowledge into the model to improve its performance and explainability, exemplifying its use for selecting and identifying relevant genes. Knowledge-slanted RF is a combination of two stages. First, prior knowledge represented by graphs is translated by running a random walk with restart algorithm to determine the relevance of each gene based on its connection and localization on a protein-protein interaction network. Then, each relevance is used to modify the selection probability to draw a gene as a candidate split-feature in the conventional RF. Experiments in simulated datasets with very small sample sizes $(n \le 30)$, comparing knowledge-slanted RF against conventional RF and logistic lasso regression, suggest an improved precision in outcome prediction compared to the other methods. The knowledge-slanted RF was completed with the introduction of a modified version of the Boruta feature selection algorithm. Finally, knowledge-slanted RF identified more relevant biological genes, offering a higher level of explainability for users than conventional RF. These findings were corroborated in one real case to identify genes relevant to calcific aortic valve stenosis.
Citations: 0
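
The two stages described in the knowledge-slanted random forest abstract above can be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' implementation: the toy network `ppi`, the seed gene, the restart probability of 0.3, and the functions `random_walk_with_restart` and `draw_candidate_genes` are all hypothetical, and the selection probability is simply taken to be proportional to the random-walk score, which the abstract does not specify.

```python
import random


def random_walk_with_restart(adjacency, seeds, restart=0.3, tol=1e-12, max_iter=10_000):
    """Iterate p <- restart * p0 + (1 - restart) * W p until convergence.

    adjacency maps each node to its neighbour list (undirected graph).
    Assumes every node has at least one neighbour and seeds are in the graph.
    """
    nodes = list(adjacency)
    start = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    scores = dict(start)
    for _ in range(max_iter):
        updated = {n: restart * start[n] for n in nodes}
        for node in nodes:
            # Each node spreads its remaining mass evenly over its neighbours.
            share = (1.0 - restart) * scores[node] / len(adjacency[node])
            for neighbour in adjacency[node]:
                updated[neighbour] += share
        change = sum(abs(updated[n] - scores[n]) for n in nodes)
        scores = updated
        if change < tol:
            break
    return scores


def draw_candidate_genes(relevance, k, rng):
    """Draw k distinct genes, each draw weighted by its relevance score."""
    pool = dict(relevance)
    chosen = []
    for _ in range(min(k, len(pool))):
        genes = list(pool)
        pick = rng.choices(genes, weights=[pool[g] for g in genes], k=1)[0]
        chosen.append(pick)
        del pool[pick]
    return chosen


# Hypothetical 4-gene chain network with G1 as the only seed gene.
ppi = {"G1": ["G2"], "G2": ["G1", "G3"], "G3": ["G2", "G4"], "G4": ["G3"]}
relevance = random_walk_with_restart(ppi, seeds={"G1"})
candidates = draw_candidate_genes(relevance, k=2, rng=random.Random(0))
```

In a conventional random forest the candidate split features are drawn uniformly at each node; here genes closer to the seed in the network are drawn more often, which is the sense in which the forest is slanted toward prior knowledge.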
Enhanced labor pain monitoring using machine learning and ECG waveform analysis for uterine contraction-induced pain.
IF 4, Zone 3 (Biology)
Biodata Mining Pub Date: 2024-09-07 DOI: 10.1186/s13040-024-00383-z
Yuan-Chia Chu, Saint Shiou-Sheng Chen, Kuen-Bao Chen, Jui-Sheng Sun, Tzu-Kuei Shen, Li-Kuei Chen
Abstract:
Objectives: This study aims to develop an innovative approach for monitoring and assessing labor pain through ECG waveform analysis, utilizing machine learning techniques to monitor pain resulting from uterine contractions.
Methods: The study was conducted at National Taiwan University Hospital between January and July 2020. We collected a dataset of 6010 ECG samples from women preparing for natural spontaneous delivery (NSD). The ECG data was used to develop an ECG waveform-based Nociception Monitoring Index (NoM). The dataset was divided into training (80%) and validation (20%) sets. Multiple machine learning models, including LightGBM, XGBoost, SnapLogisticRegression, and SnapDecisionTree, were developed and evaluated. Hyperparameter optimization was performed using grid search and five-fold cross-validation to enhance model performance.
Results: The LightGBM model demonstrated superior performance with an AUC of 0.96 and an accuracy of 90%, making it the optimal model for monitoring labor pain based on ECG data. Other models, such as XGBoost and SnapLogisticRegression, also showed strong performance, with AUC values ranging from 0.88 to 0.95.
Conclusions: This study demonstrates that the integration of machine learning algorithms with ECG data significantly enhances the accuracy and reliability of labor pain monitoring. Specifically, the LightGBM model exhibits exceptional precision and robustness in continuous pain monitoring during labor, with potential applicability extending to broader healthcare settings.
Trial registration: ClinicalTrials.gov Identifier: NCT04461704.
Volume 17, issue 1, article 32. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11380346/pdf/
Citations: 0
The goldmine of GWAS summary statistics: a systematic review of methods and tools.
IF 4, Zone 3 (Biology)
Biodata Mining Pub Date: 2024-09-05 DOI: 10.1186/s13040-024-00385-x
Panagiota I Kontou, Pantelis G Bagos
Abstract: Genome-wide association studies (GWAS) have revolutionized our understanding of the genetic architecture of complex traits and diseases. GWAS summary statistics have become essential tools for various genetic analyses, including meta-analysis, fine-mapping, and risk prediction. However, the increasing number of GWAS summary statistics and the diversity of software tools available for their analysis can make it challenging for researchers to select the most appropriate tools for their specific needs. This systematic review aims to provide a comprehensive overview of the currently available software tools and databases for GWAS summary statistics analysis. We conducted a comprehensive literature search to identify relevant software tools and databases. We categorized the tools and databases by their functionality, including data management, quality control, single-trait analysis, and multiple-trait analysis. We also compared the tools and databases based on their features, limitations, and user-friendliness. Our review identified a total of 305 functioning software tools and databases dedicated to GWAS summary statistics, each with unique strengths and limitations. We provide descriptions of the key features of each tool and database, including their input/output formats, data types, and computational requirements. We also discuss the overall usability and applicability of each tool for different research scenarios. This comprehensive review will serve as a valuable resource for researchers who are interested in using GWAS summary statistics to investigate the genetic basis of complex traits and diseases. By providing a detailed overview of the available tools and databases, we aim to facilitate informed tool selection and maximize the effectiveness of GWAS summary statistics analysis.
Volume 17, issue 1, article 31. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11375927/pdf/
Citations: 0