{"title":"Intracranial pressure level prediction in traumatic brain injury by extracting features from multiple sources and using machine learning methods","authors":"Wenan Chen, Charles Cockrell, Kevin Ward, K. Najarian","doi":"10.1109/BIBM.2010.5706619","DOIUrl":"https://doi.org/10.1109/BIBM.2010.5706619","url":null,"abstract":"This paper proposes a non-intrusive method to predict/estimate the intracranial pressure (ICP) level based on features extracted from multiple sources. Specifically, these features include midline shift measurement and texture features extracted from CT slices, as well as the patient's demographic information, such as age. Injury Severity Score is also considered. After aggregating features from slices, a feature selection scheme is applied to select the most informative features. Support vector machine (SVM) is used to train the data and build the prediction model. The validation is performed with 10-fold cross validation. To avoid overfitting, all the feature selection and parameter selection are done using training data during the 10-fold cross validation for evaluation. This results in a nested cross validation scheme implemented using Rapidminer. The final classification result shows the effectiveness of the proposed method in ICP prediction.","PeriodicalId":275098,"journal":{"name":"2010 IEEE International Conference on Bioinformatics and Biomedicine (BIBM)","volume":"12 4 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123704259","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A generalized sequence pattern matching algorithm using complementary dual-seeding","authors":"Bing Ni, Leung-Yau Lo, K. Leung","doi":"10.1109/BIBM.2010.5706593","DOIUrl":"https://doi.org/10.1109/BIBM.2010.5706593","url":null,"abstract":"In this work, we define generalized (sequence) patterns, which are based on several real biological problems, including transcription factors (TFs) binding to transcription factor binding sites (TFBSs), cis-regulatory modules, protein domain analysis, and alternative splicing. Simply speaking, a generalized pattern is composed of several substrings with gaps between them. We propose a generalized pattern matching algorithm that uses a complementary dual-seeding strategy, which is sensitive to errors (both mismatches and indels). We also develop a generalized pattern matching tool, which is to our knowledge the first developed specifically for generalized pattern matching. Rather than replacing existing general-purpose matching tools such as BLAST, BLAT, and PatternHunter, our tool provides an alternative and helps users solve real problems, especially those that can be modeled as generalized patterns. We use data randomly sampled from reference sequences of the human genome (NCBI build v18) in experiments, and hit 98.74% of generalized patterns on average. The tool runs on both Linux and Windows platforms, and peak memory usage is only slightly above 1 GB.","PeriodicalId":275098,"journal":{"name":"2010 IEEE International Conference on Bioinformatics and Biomedicine (BIBM)","volume":"89 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125001716","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Gene expression rule discovery with a multi-objective neural-genetic hybrid","authors":"E. Keedwell, A. Narayanan","doi":"10.1109/BIBM.2010.5706646","DOIUrl":"https://doi.org/10.1109/BIBM.2010.5706646","url":null,"abstract":"Recent advances in microarray technology allow an unprecedented view of the biochemical mechanisms contained within a cell. Deriving useful information from the data is still proving to be a difficult task. In this paper a novel method based on a multi-objective genetic algorithm that discovers relevant sets of genes and uses a neural network to create rules using the evolved genes is described. This hybrid method is shown to work on four well-established gene expression datasets taken from the literature. The results indicate that the approach can return biologically intelligible as well as plausible results. The proposed method requires no pre-filtering or preselection of genes.","PeriodicalId":275098,"journal":{"name":"2010 IEEE International Conference on Bioinformatics and Biomedicine (BIBM)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129277712","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"MiRNAs as promising phylogenetic markers for inferring deep metazoan phylogeny and in support of Olfactores hypothesis","authors":"Q. Cai, Xiaoyan Zhang, Zuofeng Li","doi":"10.1109/BIBM.2010.5706545","DOIUrl":"https://doi.org/10.1109/BIBM.2010.5706545","url":null,"abstract":"The long branch attraction (LBA) artefact induced by fast-evolving Urochordata has hindered the interpretation of relationships among the three subphyla of Chordata. Although the Olfactores hypothesis, which places Urochordata rather than Cephalochordata as the closest relatives of Craniata, has gradually been accepted, every step of phylogenetic reconstruction has to be treated prudently to minimize the LBA phenomenon. MiRNAs (microRNAs) are well known for their 1) adherence to organism development, 2) high conservation, and 3) rarity of secondary loss, parallel evolution, and convergence among metazoans. Therefore we suppose miRNAs to be promising candidates to dispel the LBA phenomenon. We performed a phylogenetic study on 35 pre-miRNA datasets and reconstructed a Chordata phylogeny that supports the Olfactores hypothesis with less effort, using fewer datasets and an unspecified substitution model. This is the first attempt to apply miRNA sequences in interpreting Chordata phylogeny, and we regard miRNAs as promising phylogenetic markers for illuminating deuterostome evolution.","PeriodicalId":275098,"journal":{"name":"2010 IEEE International Conference on Bioinformatics and Biomedicine (BIBM)","volume":"22 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125277632","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"IsoKEGG: A logic based system for querying biological pathways in KEGG","authors":"Kazi Zakia Sultana, Anupam Bhattacharjee, H. Jamil","doi":"10.1109/BIBM.2010.5706642","DOIUrl":"https://doi.org/10.1109/BIBM.2010.5706642","url":null,"abstract":"Understanding the interaction patterns among a set of biological entities in a pathway is an important exercise because it could potentially reveal the role of the entities in biological systems. Although a considerable amount of effort has been directed to the detection and mining of patterns in biological pathways in contemporary research, querying biological pathways has remained relatively unexplored. Querying is principally different in that we retrieve pathways that satisfy a given property in terms of their topology or constituents. One such property is subnetwork matching using various constituent parameters. In this paper, we introduce a logic based framework for querying biological pathways based on a novel and generic subgraph isomorphism computation technique. We cast this technique into a graphical interface called IsoKEGG to facilitate flexible querying of KEGG pathways. We demonstrate that IsoKEGG is flexible enough to allow querying based on isomorphic pathway topologies as well as matching any combination of node names, types, and edges. It also allows editing KGML-represented query pathways and returns all pathways in KEGG that satisfy a given query condition, which users can then investigate further.","PeriodicalId":275098,"journal":{"name":"2010 IEEE International Conference on Bioinformatics and Biomedicine (BIBM)","volume":"8 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125331086","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Classification of genome-wide copy number variations and their associated SNP and gene networks analysis","authors":"Yang Liu, Yiu-Fai Lee, M. Ng","doi":"10.1109/BIBM.2010.5706526","DOIUrl":"https://doi.org/10.1109/BIBM.2010.5706526","url":null,"abstract":"Detection of genomic DNA copy number variations (CNVs) can provide a complete and more comprehensive view of human disease. In this paper, we incorporate DNA copy number variation data derived from SNP arrays into a computational shrunken model and formalize the detection of copy number variations as a case-control classification problem. By shrinkage, the number of CNVs relevant to disease can be determined. In order to understand the relevant CNVs, we study their corresponding SNPs in the genome and identify the unique genes in which those SNPs are located. A gene-gene similarity value is computed using GOSemSim, and gene pairs with a similarity value greater than a threshold are selected to construct several groups of genes. For the SNPs involved in these groups of genes, the statistical software PLINK is employed to compute the pair-wise SNP-SNP interactions and identify SNP networks based on their p-values. Using two real genome-wide data sets, we further demonstrate that SNP and gene networks play a role in the biological process. An analysis shows that such networks are related, directly or indirectly, to disease study.","PeriodicalId":275098,"journal":{"name":"2010 IEEE International Conference on Bioinformatics and Biomedicine (BIBM)","volume":"102 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116125542","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Improving robustness of gene ranking by resampling and permutation based score correction and normalization","authors":"Feng Yang, K. Mao","doi":"10.1109/BIBM.2010.5706607","DOIUrl":"https://doi.org/10.1109/BIBM.2010.5706607","url":null,"abstract":"Feature ranking, which ranks features via their individual importance, is one of the frequently used feature selection techniques. Traditional feature ranking criteria are apt to produce inconsistent ranking results even with light perturbations in training samples when applied to high dimensional and small-sized gene expression data. A widely used strategy for solving the inconsistencies is the multi-criterion combination. But one problem encountered in combining multiple criteria is the score normalization. In this paper, problems in existing methods are first analyzed, and a new gene importance transformation algorithm is then proposed. Experimental studies on three popular gene expression datasets show that the multi-criterion combination based on the proposed score correction and normalization produces gene rankings with improved robustness.","PeriodicalId":275098,"journal":{"name":"2010 IEEE International Conference on Bioinformatics and Biomedicine (BIBM)","volume":"64 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127649056","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Protein-protein interaction prediction using desolvation energies and interface properties","authors":"L. Rueda, Sridip Banerjee, Md. Mominul Aziz, Mohammad Raza","doi":"10.1109/BIBM.2010.5706528","DOIUrl":"https://doi.org/10.1109/BIBM.2010.5706528","url":null,"abstract":"An important aspect in understanding and classifying protein-protein interactions (PPI) is to analyze their interfaces in order to distinguish between transient and obligate complexes. We propose a classification approach to discriminate between these two types of complexes. Our approach has two important aspects. First, we have used desolvation energies (amino acid and atom type) of the residues present in the interface, which are the input features of the classifiers. Principal components of the data were found and then the classification is performed via linear dimensionality reduction (LDR) methods. Second, we have investigated various interface properties of these interactions. From the analysis of protein quaternary structures, physicochemical properties are treated as the input features of the classifiers. Various features are extracted from each complex, and the classification is performed via different linear dimensionality reduction (LDR) methods. The results on standard benchmarks of transient and obligate protein complexes show that (i) desolvation energies are better discriminants than solvent accessibility and conservation properties, among others, and (ii) the proposed approach outperforms previous solvent accessible area based approaches using support vector machines.","PeriodicalId":275098,"journal":{"name":"2010 IEEE International Conference on Bioinformatics and Biomedicine (BIBM)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130236729","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Prediction of DNA-binding protein based on alpha shape modeling","authors":"Weiqiang Zhou, Hong Yan","doi":"10.1109/BIBM.2010.5706529","DOIUrl":"https://doi.org/10.1109/BIBM.2010.5706529","url":null,"abstract":"Previous studies of protein-DNA interaction focused on the bound structure of DNA-binding proteins and provided good but not practical results. In our work, we apply an alpha shape model to represent the surface structure of the protein-DNA complex and use structural alignment to develop an interface-atom curvature-dependent conditional probability discriminatory function for the prediction of unbound DNA-binding proteins. The proposed method provides good performance in predicting the unbound structure of DNA-binding proteins, which is potentially useful in many fields. Computer experiment results show that the curvature-dependent formalism with the optimal parameters can achieve sensitivity ranging from 48.08% to 44.23% and specificity ranging from 73.82% to 84.29%.","PeriodicalId":275098,"journal":{"name":"2010 IEEE International Conference on Bioinformatics and Biomedicine (BIBM)","volume":"637 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132971390","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Fuzzy C-means method with empirical mode decomposition for clustering microarray data","authors":"Yanfei Wang, Zuguo Yu, V. Anh","doi":"10.1109/BIBM.2010.5706561","DOIUrl":"https://doi.org/10.1109/BIBM.2010.5706561","url":null,"abstract":"Microarray techniques have revolutionized genomic research by making it possible to monitor the expression of thousands of genes in parallel. Data clustering analysis has been extensively applied to extract information from gene expression profiles obtained with DNA microarrays. Existing clustering approaches, mainly developed in computer science, have been adapted to microarray data. Among these approaches, the fuzzy C-means (FCM) method is an efficient one. However, microarray data contains noise, and the noise affects clustering results. Some clustering structure can still be found in random data without any biological significance. In this paper, we propose to combine the FCM method with the empirical mode decomposition (EMD) for clustering microarray data in order to reduce the effect of the noise. We call this method fuzzy C-means method with empirical mode decomposition (FCM-EMD). Using the FCM-EMD method on gene microarray data, we obtained better results than those using FCM only. The results suggest the clustering structures of denoised data are more reasonable and genes have tighter association with their clusters. Denoised gene data without any biological information contains no cluster structure. We find that we can avoid estimating the fuzzy parameter m to some degree by analyzing denoised microarray data. This makes clustering more efficient. Using the FCM-EMD method to analyze gene microarray data can save time and obtain more reasonable results.","PeriodicalId":275098,"journal":{"name":"2010 IEEE International Conference on Bioinformatics and Biomedicine (BIBM)","volume":"18 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132080575","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}