A Useful Tool for the Identification of DNA-binding Proteins Using Graph Convolutional Network
Authors: Dasheng Chen, Leyi Wei
Journal: Current Proteomics (JCR Q4, BIOCHEMICAL RESEARCH METHODS; impact factor 0.5)
Published: 2020-12-10
DOI: https://doi.org/10.2174/1570164618999201210225354
Citations: 2
Abstract
DNA and proteins are both essential components of living organisms. DNA-binding proteins are proteins that bind to DNA; they include helicases and proteins that bind specifically to single-stranded DNA regions, and they play a key role in the function of various biomolecules. Although several methods exist for predicting DNA-binding proteins from their sequences, graph neural networks have seen only limited use for this task.
In this article, we use graph neural networks to develop GCN-DBP, a novel predictor for identifying DNA-binding proteins.
In this study, each protein sequence is treated as a document, and each document is segmented into words according to the concept of k-mers. Document-word relationships and word co-occurrence across the corpus are used to construct a text graph. The predictor then learns protein sequence information with a two-layer graph convolutional network.
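The pipeline described above (k-mer segmentation, text graph construction, two-layer graph convolution) can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the raw-count edge weights, the sliding-window size, the one-hot node features, the symmetric normalization with self-loops, and the random untrained weights are all choices made here for the example, since the abstract does not specify them.

```python
import math
import random
from collections import Counter

def kmers(seq, k=3):
    """Split a protein sequence into overlapping k-mers (the "words")."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]

def build_text_graph(seqs, k=3, window=3):
    """Build a graph whose nodes are documents (sequences) and unique k-mers.

    Assumption: edge weights are raw counts. The source does not state
    which weighting scheme GCN-DBP uses.
    """
    docs = [kmers(s, k) for s in seqs]
    vocab = sorted({w for d in docs for w in d})
    n_docs = len(docs)
    index = {w: n_docs + j for j, w in enumerate(vocab)}
    n = n_docs + len(vocab)
    A = [[0.0] * n for _ in range(n)]
    for d, words in enumerate(docs):
        # Document-word edges: how often the k-mer occurs in the sequence.
        for w, c in Counter(words).items():
            A[d][index[w]] = A[index[w]][d] = float(c)
        # Word-word edges: co-occurrence inside a sliding window.
        for i in range(len(words)):
            for j in range(i + 1, min(i + window, len(words))):
                a, b = index[words[i]], index[words[j]]
                if a != b:
                    A[a][b] += 1.0
                    A[b][a] += 1.0
    return vocab, A

def normalize(A):
    """Symmetric normalization with self-loops: D^-1/2 (A + I) D^-1/2."""
    n = len(A)
    A = [[A[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    deg = [sum(row) for row in A]
    return [[A[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)] for i in range(n)]

def matmul(X, Y):
    """Dense matrix product on lists of lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def gcn_forward(A_hat, W0, W1):
    """Two-layer GCN: softmax(A_hat * ReLU(A_hat * X * W0) * W1), with X = I."""
    H = [[max(0.0, v) for v in row] for row in matmul(A_hat, W0)]
    logits = matmul(matmul(A_hat, H), W1)
    probs = []
    for row in logits:
        m = max(row)
        e = [math.exp(v - m) for v in row]
        s = sum(e)
        probs.append([v / s for v in e])
    return probs

# Toy sequences, invented for illustration only.
seqs = ["MKVLAAGIV", "MKTLAAGLV"]
vocab, A = build_text_graph(seqs)
A_hat = normalize(A)

rng = random.Random(0)
n, hidden, classes = len(A), 8, 2
W0 = [[rng.uniform(-0.5, 0.5) for _ in range(hidden)] for _ in range(n)]
W1 = [[rng.uniform(-0.5, 0.5) for _ in range(classes)] for _ in range(hidden)]
probs = gcn_forward(A_hat, W0, W1)

# The first len(seqs) rows are the class probabilities of the sequences.
for seq, p in zip(seqs, probs):
    print(seq, [round(v, 3) for v in p])
```

Because documents and k-mers share one graph, information about a sequence reaches other sequences through the k-mers they have in common, which is why two convolution layers are enough to connect document nodes to each other. In practice the weights would be trained on labeled sequences rather than drawn at random.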
We conducted additional experiments to compare the proposed method with four existing methods. Finally, we tested GCN-DBP on the independent data set PDB2272, where it reached an accuracy of 64.17% and a Matthews correlation coefficient (MCC) of 28.32%.
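For readers unfamiliar with the two reported metrics, the sketch below shows how accuracy and MCC are computed from a binary confusion matrix. The function names and the example counts are hypothetical and chosen only for illustration; they are not the confusion matrix behind the PDB2272 results, which the abstract does not give.

```python
import math

def accuracy(tp, tn, fp, fn):
    """Fraction of all predictions that are correct."""
    return (tp + tn) / (tp + tn + fp + fn)

def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient, in the range [-1, 1].

    Returns 0.0 when the denominator is zero (a common convention).
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0

# Hypothetical counts, not taken from the paper.
tp, tn, fp, fn = 6, 3, 1, 2
print(f"accuracy = {accuracy(tp, tn, fp, fn):.2%}")
print(f"MCC      = {mcc(tp, tn, fp, fn):.2%}")
```

Unlike accuracy, MCC uses all four cells of the confusion matrix, so it stays near zero for a predictor that is no better than chance even when the classes are imbalanced. Reporting it as a percentage, as the abstract does, simply multiplies the coefficient by 100.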
The results show that the proposed method outperforms the four other methods and can serve as a useful tool for protein classification.
Current Proteomics
BIOCHEMICAL RESEARCH METHODS; BIOCHEMISTRY & MOLECULAR BIOLOGY
CiteScore: 1.60
Self-citation rate: 0.00%
Articles published: 25
Review time: >0 weeks
About the journal:
Research in the emerging field of proteomics is growing at an extremely rapid rate. The principal aim of Current Proteomics is to publish well-timed in-depth/mini review articles in this fast-expanding area on topics relevant and significant to the development of proteomics. Current Proteomics is an essential journal for everyone involved in proteomics and related fields in both academia and industry.
Current Proteomics publishes in-depth/mini review articles in all aspects of the fast-expanding field of proteomics. All areas of proteomics are covered together with the methodology, software, databases, technological advances and applications of proteomics, including functional proteomics. Diverse technologies covered include but are not limited to:
Protein separation and characterization techniques
2-D gel electrophoresis and image analysis
Techniques for protein expression profiling including mass spectrometry-based methods and algorithms for correlative database searching
Determination of co-translational and post-translational modification of proteins
Protein/peptide microarrays
Biomolecular interaction analysis
Analysis of protein complexes
Yeast two-hybrid projects
Protein-protein interaction (protein interactome) pathways and cell signaling networks
Systems biology
Proteome informatics (bioinformatics)
Knowledge integration and management tools
High-throughput protein structural studies (using mass spectrometry, nuclear magnetic resonance and X-ray crystallography)
High-throughput computational methods for protein 3-D structure as well as function determination
Robotics, nanotechnology, and microfluidics.