Lia Obinu, Timothy Booth, Heleen De Weerd, Urmi Trivedi, Andrea Porceddu
Colora: A Snakemake Workflow for Complete Chromosome-scale De Novo Genome Assembly
bioRxiv - Genomics, published 2024-09-14
DOI: 10.1101/2024.09.10.612003 (https://doi.org/10.1101/2024.09.10.612003)
Abstract
Background: De novo assembly creates the reference genomes that underpin many modern biodiversity and conservation studies. Labs around the world are assembling large numbers of new genomes. To avoid duplicated effort and variable data quality, a best-practice assembly process is needed, implemented as an automated, portable workflow.

Results: Here we present Colora, a Snakemake workflow that produces chromosome-scale de novo genome assemblies, either primary or phased and complete with organelles, using PacBio HiFi reads, Hi-C reads, and optionally ONT reads as input. The source code of Colora is available on GitHub: https://github.com/LiaOb21/colora. Colora is also available in the Snakemake Workflow Catalog (https://snakemake.github.io/snakemake-workflow-catalog/?usage=LiaOb21%2Fcolora).

Conclusions: Colora is a user-friendly, versatile, and reproducible pipeline, ready for use by researchers looking for an automated way to obtain high-quality de novo genome assemblies.
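
The input requirements stated in the abstract (PacBio HiFi and Hi-C reads required, ONT reads optional, output either primary or phased) can be illustrated with a minimal Python sketch. The class name, field names, and file names below are hypothetical and chosen only for illustration; this is not Colora's actual configuration schema, which is documented in the GitHub repository.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class AssemblyInputs:
    """Hypothetical description of one assembly run (not Colora's config schema)."""

    hifi_reads: str                  # PacBio HiFi reads: required
    hic_reads: str                   # Hi-C reads: required
    ont_reads: Optional[str] = None  # ONT reads: optional
    mode: str = "primary"            # "primary" or "phased" assembly

    def __post_init__(self) -> None:
        # Reject runs that lack either of the two required read sets.
        if not self.hifi_reads or not self.hic_reads:
            raise ValueError("HiFi and Hi-C reads are both required")
        # Only the two assembly modes named in the abstract are accepted.
        if self.mode not in ("primary", "phased"):
            raise ValueError("mode must be 'primary' or 'phased'")

    def read_types(self) -> List[str]:
        """List the read types available for this run."""
        types = ["hifi", "hic"]
        if self.ont_reads:
            types.append("ont")
        return types


if __name__ == "__main__":
    run = AssemblyInputs("hifi.fastq.gz", "hic.fastq.gz", mode="phased")
    print(run.read_types())
```

Validating the combination of inputs before a workflow starts is a common design choice, since a missing read set would otherwise only surface after long-running assembly steps have already begun.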