IPSJ Transactions on Bioinformatics: Latest Articles, Page 6

Sparse Learner Boosting for Gene Expression Data

IPSJ Transactions on Bioinformatics Pub Date : 2010-01-01 DOI: 10.2197/IPSJTBIO.3.54

M. Pritchard

Citations: 1

Support vector machine prediction of N- and O-glycosylation sites using whole sequence information and subcellular localization

IPSJ Transactions on Bioinformatics Pub Date : 2009-12-01 DOI: 10.2197/IPSJTBIO.2.25

Kenta Sasaki, Nobuyoshi Nagamine, Y. Sakakibara

Abstract:
Background: Glycans, or sugar chains, are one of the three types of chain (DNA, protein and glycan) that constitute living organisms; they are often called "the third chain of the living organism". Based on the SWISS-PROT database, about half of all proteins are estimated to be glycosylated. Glycosylation is one of the most important post-translational modifications, affecting many critical functions of proteins, including cellular communication, and their tertiary structure. To computationally predict N-glycosylation and O-glycosylation sites, we developed three kinds of support vector machine (SVM) models, which use local information, general protein information and/or subcellular localization, taking into account the binding specificity of glycosyltransferases and the characteristic subcellular localization of glycoproteins.
Results: In our computational experiment, the model integrating all three kinds of information achieved about 90% accuracy in predicting both N-glycosylation and O-glycosylation sites. Moreover, we applied our model to a protein whose glycosylation sites had not been previously identified and showed that the predicted glycosylation sites were structurally reasonable.
Conclusions: We developed a comprehensive and effective computational method for predicting glycosylation sites that is applicable at a genome-wide level.
Volume 2, pages 25-35.

Citations: 16

A Modified Algorithm for Sequence Alignment Using Ant Colony System

IPSJ Transactions on Bioinformatics Pub Date : 2009-12-01 DOI: 10.2197/IPSJTBIO.2.63

A. Mikami, Jianming Shi

Citations: 3

IPSJ Transactions on Bioinformatics Pub Date : 2009-03-24 DOI: 10.2197/IPSJTBIO.2.15

Y. Tohsato, Yuki Nishimura

Citations: 4

Selection of Effective Sentences from a Corpus to Improve the Accuracy of Identification of Protein Names

IPSJ Transactions on Bioinformatics Pub Date : 2009-01-01 DOI: 10.2197/IPSJTBIO.2.93

Kazunori Miyanishi, Tomonobu Ozaki, T. Ohkawa

Abstract: As the number of documents about protein structural analysis increases, a method of automatically identifying protein names in them is required. However, identification accuracy is not high if the training data set is not large enough. We consider a machine-learning-based method that extends a training data set using an available corpus. Such a corpus usually consists of documents about a particular organism species, and documents about different species tend to have different vocabularies. Therefore, depending on the target document or corpus, simply using a corpus as a training data set is not effective for accurate identification. To improve accuracy, we propose a method that selects sentences that have a positive effect on identification and extends the training data set with the selected sentences. In the proposed method, a portion of a set of tagged sentences is used as a validation set. The sentence selection process is iterated, using the result of identifying protein names in the validation set as feedback. In the experiment, the proposed method achieved higher accuracy than every baseline: training without a corpus, with the whole corpus, or with a randomly chosen part of the corpus. Thus, it was confirmed that the proposed method selected effective sentences.
Volume 2, pages 93-100.
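The validation-feedback selection described in this abstract could be sketched roughly as follows. This is a minimal illustration, not the authors' algorithm: the greedy accept-if-improved rule, the function name `select_sentences`, and the toy scoring function are all assumptions introduced here, since the abstract only states that selection is iterated using validation results as feedback.

```python
from typing import Callable, List, Sequence, Tuple


def select_sentences(
    train: Sequence[str],
    candidates: Sequence[str],
    score: Callable[[List[str]], float],
) -> Tuple[List[str], float]:
    """Greedily keep a corpus sentence only if it raises the validation score.

    `score` is a hypothetical callback standing in for "train a protein-name
    tagger on these sentences and measure accuracy on the validation set".
    """
    selected: List[str] = []
    best = score(list(train))
    for sentence in candidates:
        trial = score(list(train) + selected + [sentence])
        if trial > best:  # feedback step: accept only on improvement
            selected.append(sentence)
            best = trial
    return selected, best


def toy_score(sentences: List[str]) -> float:
    # Toy stand-in for validation accuracy (assumption, not from the paper):
    # reward coverage of useful words, penalise off-topic words.
    useful = {"kinase", "p53", "insulin"}
    noise = {"weather", "football"}
    words = {w for s in sentences for w in s.split()}
    return len(words & useful) - len(words & noise)


train = ["p53 binds DNA"]
corpus = ["insulin receptor kinase", "football weather report", "kinase activity"]
selected, best = select_sentences(train, corpus, toy_score)
print(selected, best)
```

In a real setting the scoring callback would retrain and evaluate a tagger at each step, which is expensive; evaluating batches of sentences rather than single sentences is one plausible way to reduce that cost.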

Citations: 1

Nonmetric Distances for Barcode of Life

IPSJ Transactions on Bioinformatics Pub Date : 2008-01-01 DOI: 10.2197/IPSJTBIO.1.35

H. Akiba, Y-h. Taguchi

Citations: 0

A Linear Time Algorithm that Infers Hidden Strings from Their Concatenations

IPSJ Transactions on Bioinformatics Pub Date : 2008-01-01 DOI: 10.2197/IPSJTBIO.1.13

Tomohiro Yasuda

Citations: 0