Mete Orhun Minbay, Richard Sun, Vijay Ramachandran, Ahmet Ay, Tamer Kahveci
{"title":"OLTA: Optimizing bait seLection for TArgeted sequencing.","authors":"Mete Orhun Minbay, Richard Sun, Vijay Ramachandran, Ahmet Ay, Tamer Kahveci","doi":"10.1093/bioinformatics/btaf146","DOIUrl":null,"url":null,"abstract":"<p><strong>Motivation: </strong>Targeted enrichment via capture probes, also known as baits, is a promising complementary procedure for next-generation sequencing methods. This technique uses short biotinylated oligonucleotide probes that hybridize with complementary genetic material in a sample. Following hybridization, the target fragments can be easily isolated and processed with minimal contamination from irrelevant material. Designing an efficient set of baits for a set of target sequences, however, is an NP-hard problem.</p><p><strong>Results: </strong>We develop a novel heuristic algorithm that leverages the similarities between the characteristics of the Minimum Bait Cover and the Closest String problems to reduce the number of baits to cover a given target sequence. Our results on real and synthetic datasets demonstrate that our algorithm, OLTA produces fewest baits for nearly all experimental settings and datasets. On average, it produces 6% and 11% fewer baits than the next best state-of-the-art methods for two major real datasets, AIV and MEGARES. Also, its bait set has the highest utilization and the minimum redundancy.</p><p><strong>Availability and implementation: </strong>Our algorithm is available at github.com/FuelTheBurn/OLTA-Optimizing-bait-seLection-for-TArgeted-sequencing. Test data and other software are archived at doi.org/10.5281/zenodo.15086636.</p>","PeriodicalId":93899,"journal":{"name":"Bioinformatics (Oxford, England)","volume":" ","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2025-03-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12033030/pdf/","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Bioinformatics (Oxford, England)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1093/bioinformatics/btaf146","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0
Abstract
Motivation: Targeted enrichment via capture probes, also known as baits, is a promising complementary procedure for next-generation sequencing methods. This technique uses short biotinylated oligonucleotide probes that hybridize with complementary genetic material in a sample. Following hybridization, the target fragments can be easily isolated and processed with minimal contamination from irrelevant material. Designing an efficient set of baits for a set of target sequences, however, is an NP-hard problem.
Results: We develop a novel heuristic algorithm that leverages the similarities between the characteristics of the Minimum Bait Cover and the Closest String problems to reduce the number of baits to cover a given target sequence. Our results on real and synthetic datasets demonstrate that our algorithm, OLTA produces fewest baits for nearly all experimental settings and datasets. On average, it produces 6% and 11% fewer baits than the next best state-of-the-art methods for two major real datasets, AIV and MEGARES. Also, its bait set has the highest utilization and the minimum redundancy.
Availability and implementation: Our algorithm is available at github.com/FuelTheBurn/OLTA-Optimizing-bait-seLection-for-TArgeted-sequencing. Test data and other software are archived at doi.org/10.5281/zenodo.15086636.