Bioinformatics advances: latest articles

Adaptive adjustment of profile HMM significance thresholds improves functional and metabolic insights into microbial genomes.
IF 2.4
Bioinformatics advances Pub Date : 2025-03-21 eCollection Date: 2025-01-01 DOI: 10.1093/bioadv/vbaf039
Kathryn Kananen, Iva Veseli, Christian J Quiles Pérez, Samuel E Miller, A Murat Eren, Patrick H Bradley
Abstract
Motivation: Gene function annotation in microbial genomes and metagenomes is a fundamental in silico first step toward understanding metabolic potential and determinants of fitness. The Kyoto Encyclopedia of Genes and Genomes publishes a curated list of profile hidden Markov models to identify orthologous gene families (KOfams) with roles in metabolism. However, the computational tools that rely upon KOfams yield different annotations for the same set of genomes, leading to different downstream biological inferences.
Results: Here, we apply three open-source software tools that can annotate KOfams to genomes of phylogenetically diverse bacterial families from host-associated and free-living biomes. We use multiple computational approaches to benchmark these methods and investigate individual case studies where they differ. Our results show that despite their fundamental similarities, these methods have different annotation rates and quality. In particular, a method that adaptively tunes sequence similarity thresholds substantially improves sensitivity while maintaining high accuracy. We observe particularly large improvements for protein families with few reference sequences, or when annotating genomes from nonmodel organisms (such as gut-dwelling Lachnospiraceae). Our findings show that small improvements in annotation workflows can maximize the utility of existing databases and meaningfully improve in silico characterizations of microbial metabolism.
Availability and implementation: Anvi'o is available at https://anvio.org under the GNU GPL license. Scripts and workflow are available at https://github.com/pbradleylab/2023-anvio-comparison under the MIT license.
Bioinformatics advances, volume 5, issue 1, vbaf039. Publication type: Journal Article.
PMC PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11964587/pdf/
Citations: 0
peptidy: a light-weight Python library for peptide representation in machine learning.
IF 2.4
Bioinformatics advances Pub Date : 2025-03-21 eCollection Date: 2025-01-01 DOI: 10.1093/bioadv/vbaf058
Rıza Özçelik, Laura van Weesep, Sarah de Ruiter, Francesca Grisoni
Abstract
Motivation: Peptides are widely used in applications ranging from drug discovery to food technologies. Machine learning has become increasingly prominent in accelerating the search for new peptides, and user-friendly computational tools can further enhance these efforts.
Results: In this work, we introduce peptidy, a lightweight Python library that facilitates converting peptides (expressed as amino acid sequences) to numerical representations suited to machine learning. peptidy is free from external dependencies, integrates seamlessly into modern Python environments, and supports a range of encoding strategies suitable for both predictive and generative machine learning approaches. Additionally, peptidy supports peptides with post-translational modifications, such as phosphorylation, acetylation, and methylation, thereby extending the functionality of existing Python packages for peptides and proteins.
Availability and implementation: peptidy is freely available with a permissive license on GitHub at the following URL: https://github.com/molML/peptidy.
Bioinformatics advances, volume 5, issue 1, vbaf058. Publication type: Journal Article.
PMC PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11961219/pdf/
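To make "converting peptides to numerical representations" concrete, here is a minimal sketch of one common encoding, one-hot encoding with zero padding, in dependency-free Python. It does not use or reproduce the peptidy API. The names AMINO_ACIDS and one_hot_encode are hypothetical, and post-translational modifications are not handled.

```python
# Illustrative sketch only: this is NOT the peptidy API.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues (assumed alphabet)
AA_INDEX = {aa: i for i, aa in enumerate(AMINO_ACIDS)}


def one_hot_encode(sequence, max_len):
    """Encode a peptide as a max_len x 20 matrix of 0/1 values.

    Rows past the end of the sequence are all-zero padding, so every
    peptide maps to a matrix of the same shape.
    """
    if len(sequence) > max_len:
        raise ValueError("sequence is longer than max_len")
    matrix = []
    for residue in sequence:
        row = [0] * len(AMINO_ACIDS)
        row[AA_INDEX[residue]] = 1  # raises KeyError for non-standard residues
        matrix.append(row)
    # Pad with zero rows up to the fixed length.
    padding = max_len - len(sequence)
    matrix.extend([0] * len(AMINO_ACIDS) for _ in range(padding))
    return matrix


if __name__ == "__main__":
    encoded = one_hot_encode("ACD", max_len=5)
    for row in encoded:
        print(row)
```

A fixed max_len is one simple way to give variable-length peptides a uniform input shape for a predictive model; other encodings (for example integer tokens for generative models) trade off differently.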
Citations: 0
AntiFold: improved structure-based antibody design using inverse folding.
IF 2.4
Bioinformatics advances Pub Date : 2025-03-21 eCollection Date: 2025-01-01 DOI: 10.1093/bioadv/vbae202
Magnus Haraldson Høie, Alissa M Hummer, Tobias H Olsen, Broncio Aguilar-Sanjuan, Morten Nielsen, Charlotte M Deane
Abstract
Summary: The design and optimization of antibodies requires an intricate balance across multiple properties. Protein inverse folding models, capable of generating diverse sequences folding into the same structure, are promising tools for maintaining structural integrity during antibody design. Here, we present AntiFold, an antibody-specific inverse folding model, fine-tuned from ESM-IF1 on solved and predicted antibody structures. AntiFold outperforms existing inverse folding tools on sequence recovery across complementarity-determining regions, with designed sequences showing high structural similarity to their solved counterpart. It additionally achieves stronger correlations when predicting antibody-antigen binding affinity in a zero-shot manner. AntiFold assigns low probabilities to mutations that disrupt antigen binding, synergizing with protein language model residue probabilities, and demonstrates promise for guiding antibody optimization while retaining structure-related properties.
Availability and implementation: AntiFold is freely available under the BSD 3-Clause as a web server (https://opig.stats.ox.ac.uk/webapps/antifold/) and pip-installable package (https://github.com/oxpig/AntiFold).
Bioinformatics advances, volume 5, issue 1, vbae202. Publication type: Journal Article.
PMC PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11961221/pdf/
Citations: 0
Zepyros: a webserver to evaluate the shape complementarity of protein-protein interfaces.
IF 2.4
Bioinformatics advances Pub Date : 2025-03-20 eCollection Date: 2025-01-01 DOI: 10.1093/bioadv/vbaf051
Mattia Miotto, Lorenzo Di Rienzo, Leonardo Bo', Giancarlo Ruocco, Edoardo Milanetti
Abstract
Motivation: Shape complementarity of molecular surfaces at the interfaces is a well-known characteristic of protein-protein binding regions, and it is critical in influencing the stability of the complex. Measuring such complementarity is of great importance for a number of theoretical and practical implications; however, only a limited number of tools are currently available to efficiently and rapidly assess it.
Results: Here, we introduce Zepyros (ZErnike Polynomials analYsis of pROtein Shapes), a webserver for fast measurement of the shape complementarity between two molecular interfaces of a given protein-protein complex using structural information. Zepyros is implemented as a publicly available tool with a user-friendly interface.
Availability and implementation: Our server can be found at the following link (all major browsers supported): https://zepyros.bio-groups.com.
Bioinformatics advances, volume 5, issue 1, vbaf051. Publication type: Journal Article.
PMC PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11968322/pdf/
Citations: 0
Aggregating residue-level protein language model embeddings with optimal transport.
IF 2.4
Bioinformatics advances Pub Date : 2025-03-20 eCollection Date: 2025-01-01 DOI: 10.1093/bioadv/vbaf060
Navid NaderiAlizadeh, Rohit Singh
Abstract
Motivation: Protein language models (PLMs) have emerged as powerful approaches for mapping protein sequences into embeddings suitable for various applications. As protein representation schemes, PLMs generate per-token (i.e. per-residue) representations, resulting in variable-sized outputs based on protein length. This variability poses a challenge for protein-level prediction tasks that require uniform-sized embeddings for consistent analysis across different proteins. Previous work has typically used average pooling to summarize token-level PLM outputs, but it is unclear whether this method effectively prioritizes the relevant information across token-level representations.
Results: We introduce a novel method utilizing optimal transport to convert variable-length PLM outputs into fixed-length representations. We conceptualize per-token PLM outputs as samples from a probabilistic distribution and employ sliced-Wasserstein distances to map these samples against a reference set, creating a Euclidean embedding in the output space. The resulting embedding is agnostic to the length of the input and represents the entire protein. We demonstrate the superiority of our method over average pooling for several downstream prediction tasks, particularly with constrained PLM sizes, enabling smaller-scale PLMs to match or exceed the performance of average-pooled larger-scale PLMs. Our aggregation scheme is especially effective for longer protein sequences by capturing essential information that might be lost through average pooling.
Availability and implementation: Our implementation code can be found at https://github.com/navid-naderi/PLM_SWE.
Bioinformatics advances, volume 5, issue 1, vbaf060. Publication type: Journal Article.
PMC PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11961220/pdf/
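The idea of mapping variable-length per-residue outputs against a reference set can be sketched as follows. This is a simplified illustration under stated assumptions, not the authors' implementation (see their repository for that): the slicing directions and reference points here are random rather than learned, the reference is sorted within each slice, and the function name swe_pool is hypothetical. Each slice projects the token embeddings to one dimension, resamples their sorted values at the reference's quantile positions, and records the displacement from the sorted reference.

```python
import numpy as np


def swe_pool(tokens, reference, directions):
    """Pool (n, d) token embeddings into a fixed-length vector.

    tokens:     (n, d) per-residue embeddings; n varies per protein
    reference:  (m, d) reference points (assumed fixed across proteins)
    directions: (d, L) slicing directions (assumed unit-norm columns)
    Returns a vector of length m * L, independent of n.
    """
    n, m = tokens.shape[0], reference.shape[0]
    # Project onto each 1-D slice and sort to get empirical quantiles.
    tok = np.sort(tokens @ directions, axis=0)     # (n, L)
    ref = np.sort(reference @ directions, axis=0)  # (m, L)
    # Mid-point quantile positions for the input and the reference.
    tok_q = (np.arange(n) + 0.5) / n
    ref_q = (np.arange(m) + 0.5) / m
    # Resample the input quantile function at the reference positions.
    resampled = np.stack(
        [np.interp(ref_q, tok_q, tok[:, k]) for k in range(directions.shape[1])],
        axis=1,
    )                                              # (m, L)
    # Displacement between matched quantiles, flattened to one vector.
    return (resampled - ref).ravel()


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    d, m, L = 8, 4, 3
    reference = rng.normal(size=(m, d))
    directions = rng.normal(size=(d, L))
    directions /= np.linalg.norm(directions, axis=0, keepdims=True)
    short_protein = rng.normal(size=(5, d))
    long_protein = rng.normal(size=(50, d))
    print(swe_pool(short_protein, reference, directions).shape)
    print(swe_pool(long_protein, reference, directions).shape)
```

Both calls return vectors of the same length, which is the property the abstract describes as being agnostic to input length. Average pooling has the same property but keeps only the mean of each dimension, whereas this sketch keeps m quantiles per slice.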
Citations: 0
Biological databases in the age of generative artificial intelligence.
IF 2.4
Bioinformatics advances Pub Date : 2025-03-20 eCollection Date: 2025-01-01 DOI: 10.1093/bioadv/vbaf044
Mihai Pop, Teresa K Attwood, Judith A Blake, Philip E Bourne, Ana Conesa, Terry Gaasterland, Lawrence Hunter, Carl Kingsford, Oliver Kohlbacher, Thomas Lengauer, Scott Markel, Yves Moreau, William S Noble, Christine Orengo, B F Francis Ouellette, Laxmi Parida, Natasa Przulj, Teresa M Przytycka, Shoba Ranganathan, Russell Schwartz, Alfonso Valencia, Tandy Warnow
Abstract
Summary: Modern biological research critically depends on public databases. The introduction and propagation of errors within and across databases can lead to wasted resources as scientists are led astray by bad data or have to conduct expensive validation experiments. The emergence of generative artificial intelligence systems threatens to compound this problem owing to the ease with which massive volumes of synthetic data can be generated. We provide an overview of several key issues that occur within the biological data ecosystem and make several recommendations aimed at reducing data errors and their propagation. We specifically highlight the critical importance of improved educational programs aimed at biologists and life scientists that emphasize best practices in data engineering. We also argue for increased theoretical and empirical research on data provenance, error propagation, and on understanding the impact of errors on analytic pipelines. Furthermore, we recommend enhanced funding for the stewardship and maintenance of public biological databases.
Availability and implementation: Not applicable.
Bioinformatics advances, volume 5, issue 1, vbaf044. Publication type: Journal Article.
PMC PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11964588/pdf/
Citations: 0
S2Map: a novel computational platform for identifying secretio-types through cell secretion-signal map.
IF 2.4
Bioinformatics advances Pub Date : 2025-03-20 eCollection Date: 2025-01-01 DOI: 10.1093/bioadv/vbaf059
Zongliang Yue, Lang Zhou, Peizhen Sun, Xuejia Kang, Fengyuan Huang, Pengyu Chen
Abstract
Motivation: Cell communication is predominantly governed by secreted proteins, whose diverse secretion patterns often signify underlying physiological irregularities. Understanding these secreted signals at an individual cell level is crucial for gaining insights into regulatory mechanisms involving various molecular agents. To elucidate the array of cell secretion signals, which encompass different types of biomolecular secretion cues from individual immune cells, we introduce the secretion-signal map (S2Map).
Results: S2Map is an online interactive analytical platform designed to explore and interpret distinct cell secretion-signal patterns visually. It incorporates two innovative qualitative metrics, the signal inequality index (SII) and the signal coverage index (SCI), which are exquisitely sensitive in measuring dissymmetry and diffusion of signals in temporal data. S2Map's innovation lies in its depiction of signals through time-series analysis with multi-layer visualization. We tested the SII and SCI performance in distinguishing the simulated signal diffusion models. S2Map hosts a repository of single-cell secretion-signal data for exploring cell secretio-types, a new cell phenotyping based on the cell secretion signal pattern. We anticipate that S2Map will be a powerful tool to delve into the complexities of physiological systems, providing insights into the regulation of protein production, such as cytokines, at the remarkable resolution of single cells.
Availability and implementation: The S2Map server is publicly accessible via https://au-s2map.streamlit.app/.
Bioinformatics advances, volume 5, issue 1, vbaf059. Publication type: Journal Article.
PMC PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11972122/pdf/
Citations: 0
RNApysoforms: fast rendering interactive visualization of RNA isoform structure and expression in Python.
IF 2.4
Bioinformatics advances Pub Date : 2025-03-14 eCollection Date: 2025-01-01 DOI: 10.1093/bioadv/vbaf057
Bernardo Aguzzoli Heberle, Madeline L Page, Emil K Gustavsson, Mina Ryten, Mark T W Ebbert
Abstract
Summary: Alternative splicing generates multiple RNA isoforms from a single gene, enriching genetic diversity and impacting gene function. Effective visualization of these isoforms and their expression patterns is crucial but challenging due to limitations in existing tools. Traditional genome browsers lack programmability, while other tools offer limited customization, produce static plots, or cannot simultaneously display structures and expression levels. RNApysoforms was developed to address these gaps by providing a Python-based package that enables concurrent visualization of RNA isoform structures and expression data. Leveraging plotly and polars libraries, it offers an interactive, customizable, and faster-rendering framework suitable for web applications, enhancing the analysis and dissemination of RNA isoform research.
Availability and implementation: RNApysoforms is a Python package available at https://github.com/UK-SBCoA-EbbertLab/RNApysoforms and https://zenodo.org/records/14941190 under an open-source MIT license. It can be easily installed using the pip package installer for Python. Thorough documentation and usage vignettes are available at: https://rna-pysoforms.readthedocs.io/en/latest/.
Bioinformatics advances, volume 5, issue 1, vbaf057. Publication type: Journal Article.
PMC PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11964586/pdf/
Citations: 0
Challenges in predicting PROTAC-mediated protein-protein interfaces with AlphaFold reveal a general limitation on small interfaces.
IF 2.4
Bioinformatics advances Pub Date : 2025-03-14 eCollection Date: 2025-01-01 DOI: 10.1093/bioadv/vbaf056
Gilberto P Pereira, Corentin Gouzien, Paulo C T Souza, Juliette Martin
Abstract
Motivation: Proteolysis Targeting Chimeras (PROTACs) are heterobifunctional molecules composed of ligands that bind a target protein and an E3-ligase complex, connected by a linker, and that induce proximity-based target protein degradation. PROTACs are promising alternatives to conventional drugs against cancer. Predicting PROTAC-mediated complexes is often the first step for in silico PROTAC design pipelines. We previously noted that AlphaFold2 (AF2) fails to predict PROTAC-mediated complexes.
Results: Here, we investigate the potential causes of this limitation. We consider a set of 326 protein heterodimers orthogonal to the AF2 training set, and evaluate AF2 models focusing on the interface size and presence of interface ligand. Our results show that AF2-multimer predictions are sensitive to the size of the interface to be predicted, even in the absence of ligands, with the majority of models being incorrect for the smallest interfaces. We also benchmark both AF2 and AF3 on a set of 28 PROTAC-mediated dimers and show that AF3 does not significantly improve upon the accuracy of AF2. The low accuracy of AF2 on complexes with small interfaces has strong implications for computational pipelines for PROTAC design, as these typically stabilize small interfaces, and more generally for any prediction task that involves small interfaces.
Availability and implementation: All the models analyzed in this article are available in the Zenodo archive https://zenodo.org/records/14810843.
Bioinformatics advances, volume 5, issue 1, vbaf056. Publication type: Journal Article.
PMC PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11938821/pdf/
Citations: 0
RegionScan: a comprehensive R package for region-level genome-wide association testing with integration and visualization of multiple-variant and single-variant hypothesis testing.
IF 2.4
Bioinformatics advances Pub Date : 2025-03-13 eCollection Date: 2025-01-01 DOI: 10.1093/bioadv/vbaf052
Myriam Brossard, Delnaz Roshandel, Kexin Luo, Fatemeh Yavartanoo, Andrew D Paterson, Yun J Yoo, Shelley B Bull
Abstract
Summary: RegionScan is designed for scalable genome-wide association testing of both multiple-variant and single-variant region-level statistics, with visualization of the results. For detection of association under various regional architectures, it implements three classes of state-of-the-art region-level tests, including multiple-variant linear/logistic regression (with and without dimension reduction), a variance-component score test, and region-level minP tests. RegionScan also supports the analysis of multi-allelic variants and unbalanced binary phenotypes and is compatible with widely used variant call format (VCF) files for both genotyped and imputed variants. Association testing leverages linkage disequilibrium (LD) structure in pre-defined regions, for example, LD-adaptive regions obtained by genomic partitioning, and accommodates parallel processing to improve computational and memory efficiency. Detailed outputs (with allele frequencies, variant-LD bin assignment, single/joint variant effect estimates and region-level results) and utility functions are provided to assist comparison, visualization, and interpretation of results. Thus, RegionScan analysis offers valuable insights into region-level genetic architecture, which supports a wide range of potential applications.
Availability and implementation: RegionScan is freely available for download on GitHub (https://github.com/brossardMyriam/RegionScan).
Bioinformatics advances, volume 5, issue 1, vbaf052. Publication type: Journal Article.
PMC PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11951254/pdf/
Citations: 0