Authors: O. Lamont, H. Liang, M. Bellgard
Published in: The Seventh Australian and New Zealand Intelligent Information Systems Conference, 2001
DOI: 10.1109/ANZIIS.2001.974114 (https://doi.org/10.1109/ANZIIS.2001.974114)
Data representation influences protein secondary structure prediction using artificial neural networks
Artificial Neural Networks (ANNs) have been used very successfully for a number of classification problems in molecular biology. Protein secondary structure prediction is one of the oldest and best defined of these problems. Yet despite the considerable amount of work conducted in this field, a number of fundamental computational issues have not been thoroughly investigated, if they have been considered at all. One important issue is identifying an appropriate data representation for input into the ANN. In this paper, we investigate a range of new encoding schemes and evaluate their performance using a recently introduced evaluation criterion. The new schemes preserve the redundant information in DNA codons that is lost when the codons are translated into amino acids. Interestingly, with our new data representation, β-strand prediction performance was consistently higher (a 14% improvement) than that of ANNs trained with the conventional representation.
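The abstract does not spell out the encoding schemes, so the following Python sketch is only an illustration of the underlying idea, not the authors' method. It assumes a conventional representation that one-hot encodes each of the 20 amino acids, and a hypothetical codon-based representation that one-hot encodes each of the three nucleotides; the function names and vector layouts are assumptions made for this example. The codon table is the standard genetic code.

```python
from itertools import product

# Standard genetic code, listed in the conventional T, C, A, G base order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# The 20 amino acids (stop codons, marked '*', are excluded).
RESIDUES = sorted(set(AMINO) - {"*"})


def encode_residue(aa):
    """Assumed conventional input: a 20-dimensional one-hot vector per residue."""
    return [1 if aa == r else 0 for r in RESIDUES]


def encode_codon(codon):
    """Hypothetical codon-preserving input: 3 nucleotides x 4-way one-hot = 12 values."""
    vec = []
    for base in codon:
        vec.extend(1 if base == b else 0 for b in BASES)
    return vec


if __name__ == "__main__":
    # CTG and TTA are synonymous codons for leucine (L).
    for codon in ("CTG", "TTA"):
        aa = CODON_TABLE[codon]
        print(codon, aa, encode_residue(aa), encode_codon(codon))
```

Synonymous codons such as CTG and TTA produce identical residue vectors but different codon vectors, which is the kind of redundant information that is discarded once codons are translated into amino acids.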