H. Areiza-Laverde, L. R. Mercado-Diaz, A. E. Castro-Ospina, J. A. Jaramillo-Garzón
2016 XXI Symposium on Signal Processing, Images and Artificial Vision (STSIVA), published 2016-08-01. DOI: https://doi.org/10.1109/STSIVA.2016.7743298
Protein fold families prediction based on graph representations and machine learning methods
Predicting protein fold families remains an open challenge in molecular biology and bioinformatics, mainly because proteins adopt a broad range of complex three-dimensional configurations and because the number of proteins registered in datasets has increased dramatically in recent years. Computational alternatives to experimental methods are therefore needed. However, computational methods have had difficulty extracting features that capture both the physical-chemical attributes and the spatial features of a protein and thereby improve prediction accuracy. In this paper, we propose a graph-theoretic representation in which graph nodes represent the positions of the protein's amino acids and graph edges connect amino acids whose distance falls below a given threshold. This representation yields descriptive features related to the spatial and physical-chemical properties of proteins; these features describe the three-dimensional structure and allow protein fold families to be predicted with good accuracy.
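The graph construction described in the abstract can be illustrated with a minimal sketch. The abstract does not say which point represents each amino acid, what threshold value is used, or which graph features are extracted, so everything below is an assumption made for illustration: one 3D point per residue, a hypothetical 6.0 Å cutoff, toy coordinates, and simple degree statistics as stand-in features. The names `contact_graph` and `degree_features` are hypothetical and not taken from the paper.

```python
import math
from typing import Dict, List, Sequence, Tuple

Coord = Tuple[float, float, float]


def contact_graph(coords: Sequence[Coord], threshold: float) -> Dict[int, List[int]]:
    """Build an undirected graph as an adjacency list.

    Nodes are residue indices; an edge joins two residues whose
    representative points are closer than `threshold`.
    """
    graph: Dict[int, List[int]] = {i: [] for i in range(len(coords))}
    for i in range(len(coords)):
        for j in range(i + 1, len(coords)):
            if math.dist(coords[i], coords[j]) < threshold:
                graph[i].append(j)
                graph[j].append(i)
    return graph


def degree_features(graph: Dict[int, List[int]]) -> Dict[str, float]:
    """Summarize the graph with simple descriptors (illustrative only)."""
    degrees = [len(neighbors) for neighbors in graph.values()]
    n_nodes = len(degrees)
    return {
        "n_nodes": n_nodes,
        "n_edges": sum(degrees) // 2,  # each edge is counted at both ends
        "mean_degree": sum(degrees) / n_nodes if n_nodes else 0.0,
    }


# Toy coordinates in angstroms (hypothetical, not real protein data).
toy_coords = [
    (0.0, 0.0, 0.0),
    (3.8, 0.0, 0.0),
    (7.6, 0.0, 0.0),
    (3.8, 3.8, 0.0),
]
toy_graph = contact_graph(toy_coords, threshold=6.0)  # assumed cutoff
toy_features = degree_features(toy_graph)
print(toy_graph)
print(toy_features)
```

A feature vector of this kind, computed per protein, is the sort of input a standard classifier could consume. The pairwise loop is quadratic in the number of residues, which is adequate for a sketch but would likely be replaced by a spatial index for large structures.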