{"title":"Detecting experimental noises in protein-protein interactions with iterative sampling and model-based clustering","authors":"Hiroshi Mamitsuka","doi":"10.1109/BIBE.2003.1188977","DOIUrl":"https://doi.org/10.1109/BIBE.2003.1188977","url":null,"abstract":"One of the most important issues in current molecular biology is to build exact networks of protein-protein interactions. Recently developed high-throughput experimental techniques accumulate a vast amount of protein-protein interaction data, but it is well known that data reliability has not reached at a satisfactory level. In this paper we attempt to computationally detect experimental errors or noises presumably contained in the protein-protein interaction data by an iterative sampling method using the learning of a stochastic model as its subroutine. The method repeats two steps of selecting examples that can be regarded as non-noises, and training the component algorithm with the selected examples alternately. Noise candidates are selected as the examples having the smallest average likelihoods computed by previously obtained stochastic models. We empirically evaluated the method with other two methods by using both synthetic and real data sets. We examined the effect of noises and data sizes by using medium- and large-sized synthetic data sets that contain noises added intentionally. The results obtained by the medium-sized synthetic data sets show that the significance level of the performance difference between the method and the two other methods has more pronounced for higher noise ratios. Further experiments show that this experimental finding was also true of a large-scale data set. The performance advantage of the method was further confirmed by the experiments using a real protein-protein interaction data set.","PeriodicalId":178814,"journal":{"name":"Third IEEE Symposium on Bioinformatics and Bioengineering, 2003. 
Proceedings.","volume":"13 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2003-03-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128198396","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A data mining method to predict transcriptional regulatory sites based on differentially expressed genes in human genome","authors":"Hsien-Da Huang, Huei-Lin Chang, T. Tsou, Baw-Jhiune Liu, Jorng-Tzong Horng","doi":"10.1109/BIBE.2003.1188966","DOIUrl":"https://doi.org/10.1109/BIBE.2003.1188966","url":null,"abstract":"Very large-scale gene expression analysis, i.e., UniGene and dbEST, are provided to find those genes with significantly differential expression in specific tissues. The differentially expressed genes in a specific tissue are potentially regulated concurrently by a combination of transcription factors. This study attempts to mine putative binding sites on how combinations of the known regulatory sites homologs and over-represented repetitive elements are distributed in the promoter regions of considered groups of differentially expressed genes. We propose a data mining approach to statistically discover the significantly tissue-specific combinations of known site homologs and over-represented repetitive sequences, which are distributed in the promoter regions of differential gene groups. The association rules mined would facilitate to predict putative regulatory elements and identify genes potentially co-regulated by the putative regulatory elements.","PeriodicalId":178814,"journal":{"name":"Third IEEE Symposium on Bioinformatics and Bioengineering, 2003. Proceedings.","volume":"42 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2003-03-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125022528","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Influence of the thermal treatment applied to PAN gel on its length change and generated force","authors":"H. Tamagawa, F. Nogata, Toyotaka Watanabe, A. Abe, S. Popovic","doi":"10.1109/BIBE.2003.1188964","DOIUrl":"https://doi.org/10.1109/BIBE.2003.1188964","url":null,"abstract":"PAN gel is known for its strong matrix as well as for its fast length change by the acid-base environmental solution exchange. Besides, PAN gel is a quite soft material like a real cell. Therefore it's been regarded as a most promising material as an artificial muscle. However, its matrix strength declines extremely unfortunately in basic solution. Its matrix should be improved so as not decline, otherwise it cannot be an artificial muscle for practical use. We applied a high temperature thermal treatment and a subsequent hydrolysis to PAN gel prepared through the nearly conventional processing method, and we investigated the time dependence of its length change ratio and generated force through the acid-base solution exchange, and its durability. Although its length change and force generation performances were impaired to some extent, we found an improvement of its matrix robustness and durability.","PeriodicalId":178814,"journal":{"name":"Third IEEE Symposium on Bioinformatics and Bioengineering, 2003. Proceedings.","volume":"7 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2003-03-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125108936","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Analyzing the Escherichia coli gene expression data by a multilayer adjusted tree organizing map","authors":"Ning Wei, L. Gruenwald, T. Conway","doi":"10.1109/BIBE.2003.1188965","DOIUrl":"https://doi.org/10.1109/BIBE.2003.1188965","url":null,"abstract":"Using the DNA microarray technology, biologists have thousands of array data available. Discovering the function relations between genes and their involvements in biological processes depends on the ability to efficiently process and quantitatively analyze large amounts of array data. Clustering algorithms are among the popular tools that can be used to help biologists achieve their goals. Although some existing research projects employed clustering algorithms on biological data, none of them has examined the Escherichia coli (E. coli) gene expression data. This paper proposes a clustering algorithm called Multilayer Adjusted Tree Organizing Map (MA TOM) to analyze the E. coli gene expression data. In a semi-supervised manner, MATOM constructs a multilayer map, and at the same time, removes noise data in the previously trained maps in order to improve the training process. This paper then presents the clustering results produced by MATOM and other existing clustering algorithms using the E. coli gene expression data, and a new evaluation method to assess them. The results show that MATOM performs the best in terms of percentage of genes that are clustered correctly.","PeriodicalId":178814,"journal":{"name":"Third IEEE Symposium on Bioinformatics and Bioengineering, 2003. 
Proceedings.","volume":"8 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2003-03-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122716356","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Prediction of contact maps using support vector machines","authors":"Ying Zhao, G. Karypis","doi":"10.1109/BIBE.2003.1188926","DOIUrl":"https://doi.org/10.1109/BIBE.2003.1188926","url":null,"abstract":"Contact map prediction is of great interest for its application in fold recognition and protein 3D structure determination. In this paper we present a contact-map prediction algorithm that employs Support Vector Machines as the machine learning tool and incorporates various features such as sequence profiles and their conservation, correlated mutation analysis based on various amino acid physicochemical properties, and secondary structure. In addition, we evaluated the effectiveness of the different features on contact map prediction for different fold classes. On average, our predictor achieved a prediction accuracy of 0.2238 with an improvement over a random predictor of a factor 11.7, which is better than reported studies. Our study showed that predicted secondary structure features play an important roles for the proteins containing beta structures. Models based on secondary structure features and CMA features produce different sets of predictions. Our study also suggests that models learned separately for different protein fold families may achieve better performance than a unified model.","PeriodicalId":178814,"journal":{"name":"Third IEEE Symposium on Bioinformatics and Bioengineering, 2003. Proceedings.","volume":"26 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2003-03-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127652614","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Nodal distance algorithm: calculating a phylogenetic tree comparison metric","authors":"John Bluis, Dong-Guk Shin","doi":"10.1109/BIBE.2003.1188933","DOIUrl":"https://doi.org/10.1109/BIBE.2003.1188933","url":null,"abstract":"Maintaining a phylogenetic relationship repository requires the development of tools that are useful for mining the data stored in the repository. One way to query a database of phylogenetic information would be to compare phylogenetic trees. Because the only existing tree comparison methods are computationally intensive, this is not a reasonable task. Presented here is the nodal distance algorithm which has significantly less computation time than the most widely used comparison method, the partition metric. When the metric is calculated for trees where one species has been repositioned to a distant part of the tree no further computation is required as is needed for the partition metric. The nodal distance algorithm provides a method for comparing large sets of phylogenetic trees in a reasonable amount of time.","PeriodicalId":178814,"journal":{"name":"Third IEEE Symposium on Bioinformatics and Bioengineering, 2003. Proceedings.","volume":"28 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2003-03-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123026454","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"An open multiple instance learning framework and its application in drug activity prediction problems","authors":"Xin Huang, Shu‐Ching Chen, M. Shyu","doi":"10.1109/BIBE.2003.1188929","DOIUrl":"https://doi.org/10.1109/BIBE.2003.1188929","url":null,"abstract":"In this paper, a powerful open Multiple Instance Learning (MIL) framework is proposed. Such an open framework is powerful since different sub-methods can be plugged into the framework to generate different specific Multiple Instance Learning algorithms. In our proposed framework, the Multiple Instance Learning problem is first converted to an unconstrained optimization problem by the Minimum Square Error (MSE) criterion, and then the framework can be constructed with an open form of hypothesis and gradient search method. The proposed Multiple Instance Learning framework is applied to the drug activity problems in bioinformatics applications. Specifically, experiments are conducted on the Musk-I dataset to predict the binding activity of drug molecules. In the experiments, an algorithm with the exponential hypothesis model and the Quasi-Newton method is embedded into our proposed framework. We compare our proposed framework with other existing algorithms and the experimental results show that our proposed framework yields a good accuracy of classification, which demonstrates the feasibility and effectiveness of our framework.","PeriodicalId":178814,"journal":{"name":"Third IEEE Symposium on Bioinformatics and Bioengineering, 2003. 
Proceedings.","volume":"51 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2003-03-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121361398","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Vessel extraction techniques and algorithms: a survey","authors":"C. Kirbas, Francis K. H. Quek","doi":"10.1109/BIBE.2003.1188957","DOIUrl":"https://doi.org/10.1109/BIBE.2003.1188957","url":null,"abstract":"Vessel segmentation algorithms are critical components of circulatory blood vessel analysis systems. We present a survey of vessel extraction techniques and algorithms, putting the various approaches and techniques in perspective by means of a classification of the existing research. While we target mainly the extraction of blood vessels, neurovascular structure in particular we also review some of the segmentation methods for the tubular objects that show similar characteristics to vessels. We divide vessel segmentation algorithms and techniques into six main categories: (1) pattern recognition techniques, (2) model-based approaches, (3) tracking-based approaches, (4) artificial intelligence-based approaches, (5) neural network-based approaches, and (6) miscellaneous tube-like object detection approaches. Some of these categories are further divided into sub-categories. A table compares the papers against such criteria as dimensionality, input type, preprocessing, user interaction, and result type.","PeriodicalId":178814,"journal":{"name":"Third IEEE Symposium on Bioinformatics and Bioengineering, 2003. Proceedings.","volume":"23 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2003-03-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131809087","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A robotic device for minimally invasive breast interventions with real-time MRI guidance","authors":"B. Larson, N. Tsekos, A. Erdman","doi":"10.1109/BIBE.2003.1188946","DOIUrl":"https://doi.org/10.1109/BIBE.2003.1188946","url":null,"abstract":"We have developed a device to perform minimally invasive interventions in the breast with realtime MRI guidance for the early detection and treatment of breast cancer. The device uses five computer-controlled degrees of freedom to perform minimally invasive interventions inside a closed MRI scanner. Typically the intervention would consist of a biopsy of the suspicious lesion for diagnosis, but may involve therapies to destroy or remove malignant tissue in the breast. The procedure proceeds with: (a) conditioning of the breast along a prescribed orientation, (b) definition of an insertion vector by its height and pitch angle, and (c) insertion into the breast. The entire device is made of materials compatible with MRI, avoiding artifacts and distortion of the local magnetic field. The device is remotely controlled via a graphical user interface. This is the first surgical robotic device to perform real-time MRI-guided breast interventions in the United States.","PeriodicalId":178814,"journal":{"name":"Third IEEE Symposium on Bioinformatics and Bioengineering, 2003. Proceedings.","volume":"156 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2003-03-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116604188","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Enhanced biclustering on expression data","authors":"Jiong Yang, Haixun Wang, Wei Wang, Philip S. Yu","doi":"10.1109/BIBE.2003.1188969","DOIUrl":"https://doi.org/10.1109/BIBE.2003.1188969","url":null,"abstract":"Microarrays are one of the latest breakthroughs in experimental molecular biology, which provide a powerful tool by which the expression patterns of thousands of genes can be monitored simultaneously and are already producing huge amount of valuable data. The concept of bicluster was introduced by Cheng and Church (2000) to capture the coherence of a subset of genes and a subset of conditions. A set of heuristic algorithms were also designed to either find one bicluster or a set of biclusters, which consist of iterations of masking null values and discovered biclusters, coarse and fine node deletion, node addition, and the inclusion of inverted data. These heuristics inevitably suffer from some serious drawback. The masking of null values and discovered biclusters with random numbers may result in the phenomenon of random interference which in turn impacts the discovery of high quality biclusters. To address this issue and to further accelerate the biclustering process, we generalize the model of bicluster to incorporate null values and propose a probabilistic algorithm (FLOC) that can discover a set of k possibly overlapping biclusters simultaneously. Furthermore, this algorithm can easily be extended to support additional features that suit different requirements at virtually little cost. Experimental study on the yeast gene expression data shows that the FLOC algorithm can offer substantial improvements over the previously proposed algorithm.","PeriodicalId":178814,"journal":{"name":"Third IEEE Symposium on Bioinformatics and Bioengineering, 2003. 
Proceedings.","volume":"8 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2003-03-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121084402","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}