基于RDBMS和NOSQL的转录组数据结构和处理混合技术

Q3 Mathematics
A. M. Mukhin, M. Genaev, D. Rasskazov, S. Lashin, D. Afonnikov
{"title":"基于RDBMS和NOSQL的转录组数据结构和处理混合技术","authors":"A. M. Mukhin, M. Genaev, D. Rasskazov, S. Lashin, D. Afonnikov","doi":"10.17537/2020.15.455","DOIUrl":null,"url":null,"abstract":"\nThe transcriptome sequencing experiment (RNA-seq) has become almost a routine procedure for studying both model organisms and crops. As a result of bioinformatics processing of such experimental output, huge heterogeneous data are obtained, representing nucleotide sequences of transcripts, amino acid sequences, and their structural and functional annotation. It is important to present the data obtained to a wide range of researchers in the form of databases. This article proposes a hybrid approach to creating molecular genetic databases that contain information about transcript sequences and their structural and functional annotation. The essence of the approach consists in the simultaneous storing both structured and weakly structured data in the database. The technology was used to implement a database of transcriptomes of agricultural plants. This paper discusses the features of implementing this approach and examples of generating both simple and complex queries to such a database in the SQL language. The OORT database is freely available at https://oort.cytogen.ru/.\n","PeriodicalId":53525,"journal":{"name":"Mathematical Biology and Bioinformatics","volume":"58 1","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2020-12-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"1","resultStr":"{\"title\":\"RDBMS and NOSQL Based Hybrid Technology for Transcriptome Data Structuring and Processing\",\"authors\":\"A. M. Mukhin, M. Genaev, D. Rasskazov, S. Lashin, D. Afonnikov\",\"doi\":\"10.17537/2020.15.455\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"\\nThe transcriptome sequencing experiment (RNA-seq) has become almost a routine procedure for studying both model organisms and crops. As a result of bioinformatics processing of such experimental output, huge heterogeneous data are obtained, representing nucleotide sequences of transcripts, amino acid sequences, and their structural and functional annotation. It is important to present the data obtained to a wide range of researchers in the form of databases. This article proposes a hybrid approach to creating molecular genetic databases that contain information about transcript sequences and their structural and functional annotation. The essence of the approach consists in the simultaneous storing both structured and weakly structured data in the database. The technology was used to implement a database of transcriptomes of agricultural plants. This paper discusses the features of implementing this approach and examples of generating both simple and complex queries to such a database in the SQL language. The OORT database is freely available at https://oort.cytogen.ru/.\\n\",\"PeriodicalId\":53525,\"journal\":{\"name\":\"Mathematical Biology and Bioinformatics\",\"volume\":\"58 1\",\"pages\":\"\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2020-12-28\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"1\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Mathematical Biology and Bioinformatics\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.17537/2020.15.455\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q3\",\"JCRName\":\"Mathematics\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Mathematical Biology and Bioinformatics","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.17537/2020.15.455","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q3","JCRName":"Mathematics","Score":null,"Total":0}
引用次数: 1

摘要

转录组测序实验(RNA-seq)几乎已成为研究模式生物和作物的常规方法。对这些实验输出进行生物信息学处理,获得了大量异构数据,包括转录本的核苷酸序列、氨基酸序列及其结构和功能注释。将获得的数据以数据库的形式呈现给广泛的研究人员是很重要的。本文提出了一种混合方法来创建包含转录序列及其结构和功能注释信息的分子遗传数据库。该方法的本质在于同时在数据库中存储结构化和弱结构化数据。该技术被用于建立农业植物转录组数据库。本文讨论了实现这种方法的特点,以及用SQL语言为这种数据库生成简单和复杂查询的示例。OORT数据库可在https://oort.cytogen.ru/免费获得。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
RDBMS and NOSQL Based Hybrid Technology for Transcriptome Data Structuring and Processing
The transcriptome sequencing experiment (RNA-seq) has become almost a routine procedure for studying both model organisms and crops. As a result of bioinformatics processing of such experimental output, huge heterogeneous data are obtained, representing nucleotide sequences of transcripts, amino acid sequences, and their structural and functional annotation. It is important to present the data obtained to a wide range of researchers in the form of databases. This article proposes a hybrid approach to creating molecular genetic databases that contain information about transcript sequences and their structural and functional annotation. The essence of the approach consists in the simultaneous storing both structured and weakly structured data in the database. The technology was used to implement a database of transcriptomes of agricultural plants. This paper discusses the features of implementing this approach and examples of generating both simple and complex queries to such a database in the SQL language. The OORT database is freely available at https://oort.cytogen.ru/.
求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
Mathematical Biology and Bioinformatics
Mathematical Biology and Bioinformatics Mathematics-Applied Mathematics
CiteScore
1.10
自引率
0.00%
发文量
13
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信