Protein secondary structure prediction using data mining tool C5
M. Lu, Du Zhang, Hongjun Xu, Ken Tse-yau Lau, Li Lu
Proceedings 11th International Conference on Tools with Artificial Intelligence, 1999-11-08
DOI: 10.1109/TAI.1999.809774 (https://doi.org/10.1109/TAI.1999.809774)
Citations: 2
Abstract
This paper reports our experimental results in protein secondary structure prediction using the machine learning software C5. Our study focuses on improving the accuracy of protein secondary structure prediction. Starting with a target protein whose secondary structure is unknown, we investigate three different approaches and find that selecting training cases based on sequence homology achieves the highest predictive accuracy, 75% on the testing cases. Our results indicate that the method of selecting proteins for the training cases has the most significant impact on predictive accuracy.
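
The idea of choosing training proteins by their similarity to the target can be illustrated with a minimal Python sketch. It is not the authors' procedure: the abstract does not say how homology was measured, so `difflib.SequenceMatcher` is used here only as a crude stand-in for a real homology search, and C5 itself is not invoked. The function names, the protein names, and the sequences are all hypothetical. Accuracy is assumed to mean the fraction of residues whose predicted state matches the observed one.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Crude stand-in for sequence homology (ASSUMPTION, not the paper's measure).

    Returns a ratio between 0.0 and 1.0 based on matching characters.
    """
    return SequenceMatcher(None, a, b, autojunk=False).ratio()


def select_training_proteins(target: str, candidates: dict[str, str], k: int) -> list[str]:
    """Return the names of the k candidate sequences most similar to the target."""
    ranked = sorted(
        candidates,
        key=lambda name: similarity(target, candidates[name]),
        reverse=True,
    )
    return ranked[:k]


def per_residue_accuracy(predicted: str, observed: str) -> float:
    """Fraction of positions where predicted and observed states agree (ASSUMED metric)."""
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have the same length")
    if not observed:
        raise ValueError("sequences must not be empty")
    matches = sum(p == o for p, o in zip(predicted, observed))
    return matches / len(observed)


if __name__ == "__main__":
    # Hypothetical target and candidate amino-acid sequences.
    target = "ACDEFGHIK"
    candidates = {"p1": "ACDEFGHIK", "p2": "WWWWWWWWW", "p3": "ACDEFGHIR"}
    print(select_training_proteins(target, candidates, k=2))

    # Hypothetical predicted vs. observed states (H = helix, E = strand, C = coil).
    print(per_residue_accuracy("HHEC", "HHCC"))
```

In practice the similarity function would be replaced by a proper alignment or database search, and the selected proteins would supply the training cases for the learner.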