Recognizing Protein Secondary Structures with Neural Networks
R. Harrison, Michael McDermott, Chinua Umoja
2017 28th International Workshop on Database and Expert Systems Applications (DEXA), published 2017-08-01
DOI: 10.1109/DEXA.2017.29 (https://doi.org/10.1109/DEXA.2017.29)
Citations: 6
Abstract
Recognizing secondary structures in proteins can be computationally expensive and does not always yield good results. Using Restricted Boltzmann Machines (RBMs), we trained a simple neural network to recognize alpha-helices with a good degree of accuracy. By modifying the RBM implementation to be much simpler and more efficient than the standard implementation, we obtain a 14-fold speedup in training with no loss in detection accuracy or in cluster formation. Even with very small training sets (160 members), the network recognizes, with a high degree of accuracy, not only the alpha-helix structures it was trained on but also other, similar helix structures that it was not trained on. The RBM training results also let us cluster these structures in a meaningful way. Both training and clustering are completely unsupervised, beyond requiring that the training set meet certain constraints. Interestingly, the members of each cluster share structural similarities with one another while differing noticeably from the members of the other detected clusters. These clusters seem to form regardless of training set size or makeup.
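The abstract does not describe the authors' simplified RBM or how structures are encoded, so the following is only a generic sketch of the underlying idea: a standard Bernoulli RBM trained without labels by one-step contrastive divergence (CD-1), with inputs then grouped by their most active hidden unit. The names `TinyRBM` and `make_toy_data`, the 12-bit toy encoding, the four hidden units, and the learning rate are all illustrative assumptions and are not taken from the paper; only the training set size of 160 echoes the abstract.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


class TinyRBM:
    """Bernoulli RBM trained with CD-1 (a textbook variant, not the paper's)."""

    def __init__(self, n_visible, n_hidden, seed=0):
        self.rng = np.random.default_rng(seed)
        # Small random weights break the symmetry between hidden units.
        self.W = 0.1 * self.rng.standard_normal((n_visible, n_hidden))
        self.b_v = np.zeros(n_visible)
        self.b_h = np.zeros(n_hidden)

    def hidden_probs(self, v):
        return sigmoid(v @ self.W + self.b_h)

    def visible_probs(self, h):
        return sigmoid(h @ self.W.T + self.b_v)

    def cd1_step(self, v0, lr=0.1):
        """One full-batch CD-1 update; returns the mean squared reconstruction error."""
        h0 = self.hidden_probs(v0)                       # positive phase
        h_sample = (self.rng.random(h0.shape) < h0).astype(float)
        v1 = self.visible_probs(h_sample)                # one Gibbs step back
        h1 = self.hidden_probs(v1)                       # negative phase
        n = v0.shape[0]
        self.W += lr * (v0.T @ h0 - v1.T @ h1) / n
        self.b_v += lr * (v0 - v1).mean(axis=0)
        self.b_h += lr * (h0 - h1).mean(axis=0)
        return float(np.mean((v0 - v1) ** 2))

    def cluster_ids(self, v):
        """Assign each input to its most active hidden unit (an assumed grouping rule)."""
        return np.argmax(self.hidden_probs(v), axis=1)


def make_toy_data(n=160, n_visible=12, seed=1):
    """Hypothetical stand-in data: noisy copies of two binary prototypes."""
    rng = np.random.default_rng(seed)
    half = n_visible // 2
    protos = np.array([[1] * half + [0] * half,
                       [0] * half + [1] * half], dtype=float)
    labels = rng.integers(0, 2, size=n)
    flips = rng.random((n, n_visible)) < 0.05            # flip about 5% of the bits
    data = np.abs(protos[labels] - flips)
    return data, labels


data, labels = make_toy_data()
rbm = TinyRBM(n_visible=data.shape[1], n_hidden=4)
errors = [rbm.cd1_step(data) for _ in range(500)]        # labels are never used
ids = rbm.cluster_ids(data)
print("first/last reconstruction error:", errors[0], errors[-1])
print("cluster sizes:", np.bincount(ids, minlength=4))
```

The labels returned by `make_toy_data` are kept only so the resulting groups can be inspected afterwards; training itself sees nothing but the binary vectors, which mirrors the unsupervised setting the abstract describes.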