Ayaan Hossain, Daniel P Cetnar, Travis L LaFleur, James R McLellan, Howard M Salis
{"title":"Automated Design of Oligopools and Rapid Analysis of Massively Parallel Barcoded Measurements.","authors":"Ayaan Hossain, Daniel P Cetnar, Travis L LaFleur, James R McLellan, Howard M Salis","doi":"10.1021/acssynbio.4c00661","DOIUrl":null,"url":null,"abstract":"<p><p>Oligopool synthesis and next-generation sequencing enable the construction and characterization of large libraries of designed genetic parts and systems. As library sizes grow, it becomes computationally challenging to optimally design large numbers of primer binding sites, barcode sequences, and overlap regions to obtain efficient assemblies and precise measurements. We present the Oligopool Calculator, an end-to-end suite of algorithms and data structures that rapidly designs many thousands of oligonucleotides within an oligopool and rapidly analyzes many billions of barcoded sequencing reads. We introduce several novel concepts that greatly increase the design and analysis throughput, including orthogonally symmetric barcode design, adaptive decision trees for primer design, a Scry barcode classifier, and efficient read packing. We demonstrate the Oligopool Calculator's capabilities across computational benchmarks and real-data projects, including the design of over four million highly unique and compact barcodes in 1.2 h, the design of universal primer binding sites for one million 200-mer oligos in 15 min, and the analysis of about 500 million deep sequencing reads per hour, all on an 8-core desktop computer. Overall, the Oligopool Calculator accelerates the creative use of massively parallel experiments by eliminating the computational complexity of their design and analysis.</p>","PeriodicalId":26,"journal":{"name":"ACS Synthetic Biology","volume":" ","pages":"4218-4232"},"PeriodicalIF":3.7000,"publicationDate":"2024-12-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"ACS Synthetic Biology","FirstCategoryId":"99","ListUrlMain":"https://doi.org/10.1021/acssynbio.4c00661","RegionNum":2,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"2024/12/6 0:00:00","PubModel":"Epub","JCR":"Q1","JCRName":"BIOCHEMICAL RESEARCH METHODS","Score":null,"Total":0}
引用次数: 0
Abstract
Oligopool synthesis and next-generation sequencing enable the construction and characterization of large libraries of designed genetic parts and systems. As library sizes grow, it becomes computationally challenging to optimally design large numbers of primer binding sites, barcode sequences, and overlap regions to obtain efficient assemblies and precise measurements. We present the Oligopool Calculator, an end-to-end suite of algorithms and data structures that rapidly designs many thousands of oligonucleotides within an oligopool and rapidly analyzes many billions of barcoded sequencing reads. We introduce several novel concepts that greatly increase the design and analysis throughput, including orthogonally symmetric barcode design, adaptive decision trees for primer design, a Scry barcode classifier, and efficient read packing. We demonstrate the Oligopool Calculator's capabilities across computational benchmarks and real-data projects, including the design of over four million highly unique and compact barcodes in 1.2 h, the design of universal primer binding sites for one million 200-mer oligos in 15 min, and the analysis of about 500 million deep sequencing reads per hour, all on an 8-core desktop computer. Overall, the Oligopool Calculator accelerates the creative use of massively parallel experiments by eliminating the computational complexity of their design and analysis.
期刊介绍:
The journal is particularly interested in studies on the design and synthesis of new genetic circuits and gene products; computational methods in the design of systems; and integrative applied approaches to understanding disease and metabolism.
Topics may include, but are not limited to:
Design and optimization of genetic systems
Genetic circuit design and their principles for their organization into programs
Computational methods to aid the design of genetic systems
Experimental methods to quantify genetic parts, circuits, and metabolic fluxes
Genetic parts libraries: their creation, analysis, and ontological representation
Protein engineering including computational design
Metabolic engineering and cellular manufacturing, including biomass conversion
Natural product access, engineering, and production
Creative and innovative applications of cellular programming
Medical applications, tissue engineering, and the programming of therapeutic cells
Minimal cell design and construction
Genomics and genome replacement strategies
Viral engineering
Automated and robotic assembly platforms for synthetic biology
DNA synthesis methodologies
Metagenomics and synthetic metagenomic analysis
Bioinformatics applied to gene discovery, chemoinformatics, and pathway construction
Gene optimization
Methods for genome-scale measurements of transcription and metabolomics
Systems biology and methods to integrate multiple data sources
in vitro and cell-free synthetic biology and molecular programming
Nucleic acid engineering.