Motif Discovery as a Multiple-Instance Problem

2006 18th IEEE International Conference on Tools with Artificial Intelligence (ICTAI'06). Pub Date: 2006-11-13. DOI: 10.1109/ICTAI.2006.89

Ya Zhang, Yixin Chen, Xiang-Hua Ji


Citations: 7

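Abstract

Motif discovery from biological sequences, a challenging task both experimentally and computationally, has been studied intensively in recent years. In this paper, we formulate motif discovery as a multiple-instance problem and employ a multiple-instance learning method, MILES, to identify motifs in biological sequences. Each sequence is mapped into a feature space defined by the instances in the training sequences, using a novel instance-bag similarity measure. We employ a 1-norm SVM to select important features and construct a classifier simultaneously; the high-ranked features correspond to the discovered motifs. We apply this method to the discovery of transcription factor binding sites in promoters, a typical motif-finding problem in biology, and show that it is at least comparable to existing methods.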
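The embedding step described in the abstract can be illustrated with a minimal sketch. The abstract does not define the paper's novel instance-bag similarity measure, so this example substitutes the Gaussian best-match similarity from the original MILES formulation, with Hamming distance between k-mers standing in for the instance distance. Treating every overlapping k-mer of a sequence as an instance is also an assumption, and the function names (`kmers`, `instance_bag_similarity`, `embed`) and the toy sequences are hypothetical.

```python
import math

def kmers(seq, k):
    """All overlapping length-k substrings: the 'instances' of a sequence 'bag'."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]

def hamming(a, b):
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))

def instance_bag_similarity(instance, bag, sigma=1.0):
    """Assumed stand-in measure: best Gaussian match of `instance` within the bag."""
    return max(math.exp(-hamming(instance, x) ** 2 / sigma ** 2) for x in bag)

def embed(sequences, k, sigma=1.0):
    """Map each sequence to a vector with one feature per distinct training k-mer."""
    bags = [kmers(s, k) for s in sequences]
    concepts = sorted({x for bag in bags for x in bag})
    features = [[instance_bag_similarity(c, bag, sigma) for c in concepts]
                for bag in bags]
    return concepts, features

# Toy sequences, made up for illustration: the first two contain "TATA".
seqs = ["GGTATACC", "ACTATAGT", "GGGCCCGG"]
concepts, X = embed(seqs, k=4)
tata = concepts.index("TATA")
print([round(row[tata], 3) for row in X])
```

In the pipeline the abstract outlines, a sparse linear classifier such as a 1-norm SVM would then be trained on these feature vectors, and the k-mers whose features receive the largest weights would be read off as candidate motifs.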


