Fahim Mohammad, R. Flight, Benjamin J. Harrison, J. Petruska, E. Rouchka
2011 IEEE 11th International Conference on Bioinformatics and Bioengineering. Published 2011-10-24. DOI: 10.1109/BIBE.2011.49 (https://doi.org/10.1109/BIBE.2011.49)
Interval Trees for Detection of Overlapping Genetic Entities
In a variety of systems, annotations are mapped to a reference coordinate system at various levels of granularity: roads and landmarks on a map, features within a 2-dimensional or 3-dimensional image, or genetic entities (GEs) mapped to a reference genome. As the number of annotations grows, methods are needed to efficiently locate the entities that overlap a specific interval of interest. This paper demonstrates the efficiency of interval trees for storing, maintaining, and querying large numbers of intervals, with special attention to genetic entities. The results suggest a significant speed-up compared with relational database approaches. As such, interval trees serve as a suitable alternative for storing and searching annotations to a reference coordinate system.
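To make the overlap query concrete, here is a minimal sketch of one common interval tree variant: a binary search tree keyed on interval start, where each node also records the largest end coordinate in its subtree. It is not the authors' implementation, which the abstract does not describe. The names `Node`, `insert`, and `query`, the closed integer coordinates, and the sample entities are all assumptions made for illustration, and the tree is left unbalanced for brevity (a production version would typically use a self-balancing tree).

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Node:
    """One stored interval [start, end] (closed, assumed) plus subtree metadata."""
    start: int
    end: int
    label: str
    max_end: int                      # largest end coordinate in this subtree
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def insert(root: Optional[Node], start: int, end: int, label: str) -> Node:
    """Insert an interval, ordering nodes by start (no rebalancing in this sketch)."""
    if root is None:
        return Node(start, end, label, max_end=end)
    if start < root.start:
        root.left = insert(root.left, start, end, label)
    else:
        root.right = insert(root.right, start, end, label)
    # The new interval lives somewhere below root, so root's max_end may grow.
    root.max_end = max(root.max_end, end)
    return root


def query(root: Optional[Node], qstart: int, qend: int,
          out: Optional[List[str]] = None) -> List[str]:
    """Collect labels of all stored intervals overlapping [qstart, qend]."""
    if out is None:
        out = []
    # Prune: nothing in this subtree ends at or after the query start.
    if root is None or root.max_end < qstart:
        return out
    query(root.left, qstart, qend, out)
    if root.start <= qend and qstart <= root.end:
        out.append(root.label)
    # Every start in the right subtree is >= root.start, so it can only
    # overlap the query if root.start itself is not past the query end.
    if root.start <= qend:
        query(root.right, qstart, qend, out)
    return out


# Hypothetical genetic entities on a single chromosome (made-up coordinates).
entities = [
    (100, 500, "geneA"),
    (450, 900, "geneB"),
    (450, 600, "exonB1"),
    (1200, 1500, "geneC"),
    (480, 480, "snp1"),
]

root: Optional[Node] = None
for start, end, label in entities:
    root = insert(root, start, end, label)

print(query(root, 470, 490))
```

The `max_end` field is what lets a query skip whole subtrees instead of testing every stored interval, which is the property that makes this structure attractive compared with scanning rows for overlap.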