Sang-Min Lee, H. Tak, Kiejung Park, Hwan-Gue Cho, Do-Hoon Lee
RNA-Seq Read Simulator Using SAM Template
2013 International Conference on IT Convergence and Security (ICITCS), 2013-12-01
DOI: 10.1109/ICITCS.2013.6717877 (https://doi.org/10.1109/ICITCS.2013.6717877)
Sequencing technologies, which generate read segments from reference genes, have diversified significantly with the introduction of the Next Generation Sequencer. Although next-generation sequencing is more efficient in time and cost than its predecessor, it is still too expensive to run many experiments in succession or to reflect a particular biological specificity in the experimental settings. To address this problem, simulators have been developed that generate reads reflecting specific biological characteristics. However, read simulation still gives little consideration to other important statistical quantities, such as gene expression levels. After a brief review of state-of-the-art read simulators, focusing on their sequencing methods and functional characteristics, this paper presents a new read simulation method that takes gene expression structures into account. The proposed method extracts statistical information from SAM files that contain read mapping results and generates synthetic reads with the analyzed characteristics. We also demonstrate the effectiveness of the proposed method by comparing simulated data with real data.
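The abstract does not give the algorithm, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes that the "statistical information" is simply the number of mapped reads per reference, used as a proxy for expression level, and that synthetic reads are error-free substrings drawn in proportion to those counts. The function names `read_counts_per_reference` and `simulate_reads` and the `references` dictionary are hypothetical.

```python
import random
from collections import Counter


def read_counts_per_reference(sam_lines):
    """Count mapped reads per reference name (RNAME, the third SAM column).

    Assumption: the per-reference read count stands in for expression level.
    """
    counts = Counter()
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):
            continue  # skip blank lines and SAM header lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:
            continue  # a SAM alignment record has 11 mandatory fields
        flag, rname = int(fields[1]), fields[2]
        if flag & 0x4 or rname == "*":
            continue  # flag bit 0x4 marks an unmapped read
        counts[rname] += 1
    return counts


def simulate_reads(references, counts, n_reads, read_len, seed=0):
    """Draw error-free reads, choosing references in proportion to counts.

    `references` maps a reference name to its sequence (hypothetical input).
    Returns a list of (reference name, 0-based start, read sequence).
    """
    rng = random.Random(seed)
    names = [n for n in counts
             if n in references and len(references[n]) >= read_len]
    if not names:
        raise ValueError("no reference is long enough to sample from")
    weights = [counts[n] for n in names]
    reads = []
    for name in rng.choices(names, weights=weights, k=n_reads):
        seq = references[name]
        start = rng.randrange(len(seq) - read_len + 1)
        reads.append((name, start, seq[start:start + read_len]))
    return reads
```

A fuller simulator would also need to model quantities that this sketch ignores, such as position bias along a transcript, base quality and sequencing errors. Whether the paper covers them cannot be determined from the abstract alone.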