{"title":"RNA二级结构预测的参数化随机语法","authors":"Robert S. Maier","doi":"10.1109/ITA.2007.4357589","DOIUrl":null,"url":null,"abstract":"We propose a two-level stochastic context-free grammar (SCFG) architecture for parametrized stochastic modeling of a family of RNA sequences, including their secondary structure. A stochastic model of this type can be used for maximum a posteriori estimation of the secondary structure of any new sequence in the family. The proposed SCFG architecture models RNA subsequences comprising paired bases as stochastically weighted Dyck-language words, i.e., as weighted balanced- parenthesis expressions. The length of each run of unpaired bases, forming a loop or a bulge, is taken to have a phase-type distribution: that of the hitting time in a finite-state Markov chain. Without loss of generality, each such Markov chain can be taken to have a bounded complexity. The scheme yields an overall family SCFG with a manageable number of parameters.","PeriodicalId":439952,"journal":{"name":"2007 Information Theory and Applications Workshop","volume":"131 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2007-01-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"5","resultStr":"{\"title\":\"Parametrized Stochastic Grammars for RNA Secondary Structure Prediction\",\"authors\":\"Robert S. Maier\",\"doi\":\"10.1109/ITA.2007.4357589\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"We propose a two-level stochastic context-free grammar (SCFG) architecture for parametrized stochastic modeling of a family of RNA sequences, including their secondary structure. A stochastic model of this type can be used for maximum a posteriori estimation of the secondary structure of any new sequence in the family. The proposed SCFG architecture models RNA subsequences comprising paired bases as stochastically weighted Dyck-language words, i.e., as weighted balanced- parenthesis expressions. The length of each run of unpaired bases, forming a loop or a bulge, is taken to have a phase-type distribution: that of the hitting time in a finite-state Markov chain. Without loss of generality, each such Markov chain can be taken to have a bounded complexity. The scheme yields an overall family SCFG with a manageable number of parameters.\",\"PeriodicalId\":439952,\"journal\":{\"name\":\"2007 Information Theory and Applications Workshop\",\"volume\":\"131 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2007-01-24\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"5\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2007 Information Theory and Applications Workshop\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/ITA.2007.4357589\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2007 Information Theory and Applications Workshop","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ITA.2007.4357589","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Parametrized Stochastic Grammars for RNA Secondary Structure Prediction
We propose a two-level stochastic context-free grammar (SCFG) architecture for parametrized stochastic modeling of a family of RNA sequences, including their secondary structure. A stochastic model of this type can be used for maximum a posteriori estimation of the secondary structure of any new sequence in the family. The proposed SCFG architecture models RNA subsequences comprising paired bases as stochastically weighted Dyck-language words, i.e., as weighted balanced- parenthesis expressions. The length of each run of unpaired bases, forming a loop or a bulge, is taken to have a phase-type distribution: that of the hitting time in a finite-state Markov chain. Without loss of generality, each such Markov chain can be taken to have a bounded complexity. The scheme yields an overall family SCFG with a manageable number of parameters.