{"title":"基因网络重构中的数据稀疏性处理","authors":"G. B. Bezerra, T.V. Barra, F. V. Zuben, L. Castro","doi":"10.1109/CIBCB.2005.1594900","DOIUrl":null,"url":null,"abstract":"One of the main problems related to regulatory network reconstruction from expression data concerns the small size and low quality of the available dataset. When trying to infer a model from little information it is necessary to give much more precedence to generalization, rather than specificity, otherwise, any attempt will be fated to overfitting. In this paper we address this issue by focusing on data sparseness and noisy information, and propose a density estimation technique that achieves regularized curves when data is scarce. We first compare the proposed method with the EM algorithm for mixture models on density estimation problems. Next, we apply the method, together with Bayesian networks, on realistic simulations of static gene networks, and compare the obtained results with the standard discrete Bayesian network model. We intend to demonstrate that adopting a discrete approach is not justifiable when only a small amount of information is available.","PeriodicalId":330810,"journal":{"name":"2005 IEEE Symposium on Computational Intelligence in Bioinformatics and Computational Biology","volume":"1 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"1900-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"1","resultStr":"{\"title\":\"Handling Data Sparseness in Gene Network Reconstruction\",\"authors\":\"G. B. Bezerra, T.V. Barra, F. V. Zuben, L. Castro\",\"doi\":\"10.1109/CIBCB.2005.1594900\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"One of the main problems related to regulatory network reconstruction from expression data concerns the small size and low quality of the available dataset. When trying to infer a model from little information it is necessary to give much more precedence to generalization, rather than specificity, otherwise, any attempt will be fated to overfitting. In this paper we address this issue by focusing on data sparseness and noisy information, and propose a density estimation technique that achieves regularized curves when data is scarce. We first compare the proposed method with the EM algorithm for mixture models on density estimation problems. Next, we apply the method, together with Bayesian networks, on realistic simulations of static gene networks, and compare the obtained results with the standard discrete Bayesian network model. We intend to demonstrate that adopting a discrete approach is not justifiable when only a small amount of information is available.\",\"PeriodicalId\":330810,\"journal\":{\"name\":\"2005 IEEE Symposium on Computational Intelligence in Bioinformatics and Computational Biology\",\"volume\":\"1 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"1900-01-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"1\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2005 IEEE Symposium on Computational Intelligence in Bioinformatics and Computational Biology\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/CIBCB.2005.1594900\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2005 IEEE Symposium on Computational Intelligence in Bioinformatics and Computational Biology","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/CIBCB.2005.1594900","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Handling Data Sparseness in Gene Network Reconstruction
One of the main problems related to regulatory network reconstruction from expression data concerns the small size and low quality of the available dataset. When trying to infer a model from little information it is necessary to give much more precedence to generalization, rather than specificity, otherwise, any attempt will be fated to overfitting. In this paper we address this issue by focusing on data sparseness and noisy information, and propose a density estimation technique that achieves regularized curves when data is scarce. We first compare the proposed method with the EM algorithm for mixture models on density estimation problems. Next, we apply the method, together with Bayesian networks, on realistic simulations of static gene networks, and compare the obtained results with the standard discrete Bayesian network model. We intend to demonstrate that adopting a discrete approach is not justifiable when only a small amount of information is available.