Identifying Functional Binding Motifs of Tumor Protein p53 Using Support Vector Machines

Amit U. Sinha, Mukta Phatak, Raj Bhatnagar, Anil G. Jegga
Sixth International Conference on Machine Learning and Applications (ICMLA 2007). DOI: 10.1109/ICMLA.2007.46 (https://doi.org/10.1109/ICMLA.2007.46)
Citations: 11

Abstract

Identification of transcription factor binding sites in DNA sequences is a frequently performed task in bioinformatics. However, current search methods produce a large number of false positives because these motifs are short and degenerate. We propose an implicit model of cooperative binding of transcription factors. We hypothesize that the flanking regions of binding sites have a different composition from regions that do not contain the binding site. Using statistically significant motifs in the flanking regions of true binding sites as features, we design an SVM classifier for discriminating true binding sites from false positives. We demonstrate the effectiveness of our method on a data set of experimentally verified p53 binding sites. We obtained an overall accuracy of 80% on cross-validation and 76% on an independent test set. By analyzing the features, we identified known as well as potentially new binding partners of p53.
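The general idea of classifying candidate sites by the composition of their flanking regions can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes raw dinucleotide counts in place of the statistically significant motifs the abstract describes, a small linear SVM trained by hinge-loss subgradient descent in place of whatever SVM configuration the paper used, and made-up toy sequences. All function names, parameters, and data below are hypothetical.

```python
from itertools import product

def flanks(seq, start, end, width):
    """Return (upstream, downstream) flanks around the site seq[start:end]."""
    return seq[max(0, start - width):start], seq[end:end + width]

def kmer_features(seq, start, end, width=10, k=2):
    """Count every DNA k-mer in the two flanks of a candidate site.

    Assumption: plain k-mer counts stand in for the statistically
    significant flanking motifs used as features in the paper.
    """
    vocab = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = dict.fromkeys(vocab, 0)
    for flank in flanks(seq, start, end, width):
        for i in range(len(flank) - k + 1):
            kmer = flank[i:i + k]
            if kmer in counts:  # skip k-mers with N or other symbols
                counts[kmer] += 1
    return [float(counts[v]) for v in vocab]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def train_linear_svm(X, y, lam=0.01, lr=0.1, epochs=50):
    """Linear SVM via hinge-loss subgradient descent; labels are +1 / -1."""
    w, b = [0.0] * len(X[0]), 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            margin = yi * (dot(w, xi) + b)
            w = [wj * (1.0 - lr * lam) for wj in w]  # L2 regularization step
            if margin < 1.0:  # example violates the margin: hinge update
                w = [wj + lr * yi * xj for wj, xj in zip(w, xi)]
                b += lr * yi
    return w, b

def predict(w, b, x):
    return 1 if dot(w, x) + b >= 0 else -1

# Hypothetical toy data: one candidate site with different flanks.
SITE = "GGACATGCCC"
toy = [
    ("GCGCGCGCGC" + SITE + "GCGCGCGCGC", 1),   # labeled "true site"
    ("CGCGGCGCCG" + SITE + "CCGGCGCGGC", 1),
    ("ATATATATAT" + SITE + "TATATATATA", -1),  # labeled "false positive"
    ("AATTATAATT" + SITE + "TTAATATTAA", -1),
]
X = [kmer_features(seq, 10, 20) for seq, _ in toy]
y = [label for _, label in toy]
w, b = train_linear_svm(X, y)
predictions = [predict(w, b, x) for x in X]
```

In this sketch the learned weights play the role of the feature analysis mentioned in the abstract: k-mers with large positive weights are those whose presence in the flanks is most associated with the sites labeled as true.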