Janina M Jeff, Kristin Brown-Gentry, Robert Goodloe, Marylyn D Ritchie, Joshua C Denny, Abel N Kho, Loren L Armstrong, Bob McClellan, Ping Mayo, Melissa Allen, Hailing Jin, Niloufar B Gillani, Nathalie Schnetz-Boutaud, Holli H Dilks, Melissa A Basford, Jennifer A Pacheco, Gail P Jarvik, Rex L Chisholm, Dan M Roden, M Geoffrey Hayes, Dana C Crawford
{"title":"Replication of <i>SCN5A</i> Associations with Electrocardiographic Traits in African Americans from Clinical and Epidemiologic Studies.","authors":"Janina M Jeff, Kristin Brown-Gentry, Robert Goodloe, Marylyn D Ritchie, Joshua C Denny, Abel N Kho, Loren L Armstrong, Bob McClellan, Ping Mayo, Melissa Allen, Hailing Jin, Niloufar B Gillani, Nathalie Schnetz-Boutaud, Holli H Dilks, Melissa A Basford, Jennifer A Pacheco, Gail P Jarvik, Rex L Chisholm, Dan M Roden, M Geoffrey Hayes, Dana C Crawford","doi":"10.1007/978-3-662-45523-4_76","DOIUrl":"10.1007/978-3-662-45523-4_76","url":null,"abstract":"<p>The Nav1.5 sodium channel α-subunit is the predominant α-subunit expressed in the heart and is associated with cardiac arrhythmias. We tested five previously identified <i>SCN5A</i> variants (rs7374138, rs7637849, rs7637849, rs7629265, and rs11129796) for an association with PR interval and QRS duration in two unique study populations: the Third National Health and Nutrition Examination Survey (NHANES III, n = 552) accessed by the Epidemiologic Architecture for Genes Linked to Environment (EAGLE) and a combined dataset (n = 455) from two biobanks linked to electronic medical records from Vanderbilt University (BioVU) and Northwestern University (NUgene) as part of the electronic Medical Records & Genomics (eMERGE) network. A meta-analysis including all three study populations (n~4,000) suggests that eight <i>SCN5A</i> associations were significant for both QRS duration and PR interval (p<5.0E-3) with little evidence for heterogeneity across the study populations. These results suggest that published <i>SCN5A</i> associations replicate across different study designs in a meta-analysis and represent an important first step in the utility of multiple study designs for genetic studies and the identification/characterization of genetic variants associated with ECG traits in African-descent populations.</p>","PeriodicalId":90497,"journal":{"name":"Evolutionary computation, machine learning and data mining in bioinformatics. EvoBIO (Conference)","volume":"2014 ","pages":"939-951"},"PeriodicalIF":0.0,"publicationDate":"2014-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4290789/pdf/nihms644245.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"32976681","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
M. Kargar, Aijun An, N. Cercone, Kayvan Tirdad, Morteza Zihayat
{"title":"Signal detection in genome sequences using complexity based features","authors":"M. Kargar, Aijun An, N. Cercone, Kayvan Tirdad, Morteza Zihayat","doi":"10.1145/2500863.2500867","DOIUrl":"https://doi.org/10.1145/2500863.2500867","url":null,"abstract":"In this work, we tackle the problem of evaluating complexity methods and measures for finding interesting signals in the whole genome of three prokaryotic organisms. In addition to previous complexity measures, new measures are introduced for representing Open Reading Frames (ORF). We apply different classification algorithms to determine which complexity measure results in better predictive performance in discriminating genes from pseudo-genes in ORFs. Also, we investigate whether positions and lengths of windows in ORFs have significant impact on distinguishing between genes and pseudo-genes. Different classification algorithms are applied for classifying ORFs into genes and pseudo-genes.","PeriodicalId":90497,"journal":{"name":"Evolutionary computation, machine learning and data mining in bioinformatics. EvoBIO (Conference)","volume":"729 1","pages":"25-33"},"PeriodicalIF":0.0,"publicationDate":"2013-08-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"78754758","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Drug-target interaction prediction for drug repurposing with probabilistic similarity logic","authors":"Shobeir Fakhraei, L. Raschid, L. Getoor","doi":"10.1145/2500863.2500870","DOIUrl":"https://doi.org/10.1145/2500863.2500870","url":null,"abstract":"The high development cost and low success rate of drug discovery from new compounds highlight the need for methods to discover alternate therapeutic effects for currently approved drugs. Computational methods can be effective in focusing efforts for such drug repurposing. In this paper, we propose a novel drug-target interaction prediction framework based on probabilistic similarity logic (PSL) [5]. Interaction prediction corresponds to link prediction in a bipartite network of drug-target interactions extended with a set of similarities between drugs and between targets. Using probabilistic first-order logic rules in PSL, we show how rules describing link predictions based on triads and tetrads can effectively make use of a variety of similarity measures. We learn weights for the rules based on training data, and report relative importance of each similarity for interaction prediction. We show that the learned rule weights significantly improve prediction precision. We evaluate our results on a dataset of drug-target interactions obtained from Drugbank [27] augmented with five drug-based and three target-based similarities. We integrate domain knowledge in drug-target interaction prediction and match the performance of the state-of-the-art drug-target interaction prediction systems [22] with our model using simple triad-based rules. Furthermore, we apply techniques that make link prediction in PSL more efficient for drug-target interaction prediction.","PeriodicalId":90497,"journal":{"name":"Evolutionary computation, machine learning and data mining in bioinformatics. EvoBIO (Conference)","volume":"33 2 1","pages":"10-17"},"PeriodicalIF":0.0,"publicationDate":"2013-08-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"72926445","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"MFMS: maximal frequent module set mining from multiple human gene expression data sets","authors":"Saeed Salem, C. Ozcaglar","doi":"10.1145/2500863.2500869","DOIUrl":"https://doi.org/10.1145/2500863.2500869","url":null,"abstract":"Advances in genomic technologies have allowed vast amounts of gene expression data to be collected. Protein functional annotation and biological module discovery that are based on a single gene expression data set suffer from spurious coexpression. Recent work has focused on integrating multiple independent gene expression data sets. In this paper, we propose a two-step approach for mining maximally frequent collections of highly connected modules from coexpression graphs. We first mine maximal frequent edge-sets and then extract highly connected subgraphs from the edge-induced subgraphs. Experimental results on the collection of modules mined from 52 human gene expression data sets show that coexpression links that occur together in a significant number of experiments have a modular topological structure. Moreover, GO enrichment analysis shows that the proposed approach discovers biologically significant frequent collections of modules.","PeriodicalId":90497,"journal":{"name":"Evolutionary computation, machine learning and data mining in bioinformatics. EvoBIO (Conference)","volume":"20 1","pages":"51-57"},"PeriodicalIF":0.0,"publicationDate":"2013-08-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"83168458","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Sabeur Aridhi, Mondher Maddouri, H. Sghaier, E. Nguifo
{"title":"Computational phenotype prediction of ionizing-radiation-resistant bacteria with a multiple-instance learning model","authors":"Sabeur Aridhi, Mondher Maddouri, H. Sghaier, E. Nguifo","doi":"10.1145/2500863.2500866","DOIUrl":"https://doi.org/10.1145/2500863.2500866","url":null,"abstract":"Ionizing-radiation-resistant bacteria (IRRB) are important in biotechnology. The use of these bacteria for the treatment of radioactive wastes is determined by their surprising capacity of adaptation to radionuclides and a variety of toxic molecules. In silico methods are unavailable for the purpose of phenotypic prediction and genotype-phenotype relationship discovery. We analyze basal DNA repair proteins of most known proteome sequences of IRRB and ionizing-radiation-sensitive bacteria (IRSB) in order to learn a classifier that correctly predicts unseen bacteria. In this work, we formulate the problem of predicting IRRB as a multiple-instance learning (MIL) problem and we propose a novel approach for predicting IRRB. We use a local alignment technique to measure the similarity between protein sequences to predict ionizing-radiation-resistant bacteria. The first results are satisfactory and provide a MIL-based prediction system that predicts whether a bacterium belongs to IRRB or to IRSB. The proposed system is available online.","PeriodicalId":90497,"journal":{"name":"Evolutionary computation, machine learning and data mining in bioinformatics. EvoBIO (Conference)","volume":"10 1","pages":"18-24"},"PeriodicalIF":0.0,"publicationDate":"2013-08-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"75549674","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Cheng Zhou, P. Meysman, B. Cule, K. Laukens, Bart Goethals
{"title":"Mining spatially cohesive itemsets in protein molecular structures","authors":"Cheng Zhou, P. Meysman, B. Cule, K. Laukens, Bart Goethals","doi":"10.1145/2500863.2500871","DOIUrl":"https://doi.org/10.1145/2500863.2500871","url":null,"abstract":"In this paper we present a cohesive structural itemset miner aiming to discover interesting patterns in a set of data objects within a multidimensional spatial structure by combining the cohesion and the support of the pattern. The usefulness of this algorithm is demonstrated by applying it to find interesting patterns of amino acids in spatial proximity within a set of proteins based on their atomic coordinates in the protein molecular structure. The experiments show that several patterns found by the cohesive structural itemset miner contain amino acids that frequently co-occur in the spatial structure, even if they are distant in the primary protein sequence and only brought together by protein folding. Furthermore, various indications were found that some of the discovered patterns seem to represent common underlying support structures within the proteins.","PeriodicalId":90497,"journal":{"name":"Evolutionary computation, machine learning and data mining in bioinformatics. EvoBIO (Conference)","volume":"7 1","pages":"42-50"},"PeriodicalIF":0.0,"publicationDate":"2013-08-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"90897089","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Raed I. Seetan, Ajay Kumar, A. Denton, M. Iqbal, O. Azzam, S. Kianian
{"title":"A fast and scalable clustering-based approach for constructing reliable radiation hybrid maps","authors":"Raed I. Seetan, Ajay Kumar, A. Denton, M. Iqbal, O. Azzam, S. Kianian","doi":"10.1145/2500863.2500868","DOIUrl":"https://doi.org/10.1145/2500863.2500868","url":null,"abstract":"The process of mapping markers from radiation hybrid mapping (RHM) experiments is equivalent to the traveling salesman problem and, thereby, has combinatorial complexity. As an additional problem, experiments typically result in some unreliable markers that reduce the overall quality of the map. We propose a clustering approach for addressing both problems efficiently by eliminating unreliable markers without the need for mapping the complete set of markers. Traditional approaches for eliminating markers use resampling of the full data set, which has an even higher computational complexity than the original mapping problem. In contrast, the proposed approach uses a divide and conquer strategy to construct framework maps based on clusters that exclude unreliable markers. Clusters are ordered using parallel processing and are then combined to form the complete map. Using an RHM data set of the human genome, we compare the framework maps from our proposed approaches with published physical maps and with the Carthagene tool. Overall, our approach has a very low computational complexity and produces solid framework maps with good chromosome coverage and high agreement with the physical map marker order.","PeriodicalId":90497,"journal":{"name":"Evolutionary computation, machine learning and data mining in bioinformatics. EvoBIO (Conference)","volume":"3 1","pages":"34-41"},"PeriodicalIF":0.0,"publicationDate":"2013-08-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"87213764","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Heuristic approaches for time-lagged biclustering","authors":"Joana P. Gonçalves, S. Madeira","doi":"10.1145/2500863.2500865","DOIUrl":"https://doi.org/10.1145/2500863.2500865","url":null,"abstract":"Identifying patterns in temporal data supports complex analyses in several domains, including stock markets (finance) and social interactions (social science). Clinical and biological applications, such as monitoring patient response to treatment or characterizing activity at the molecular level, are also of interest. In particular, researchers seek to gain insight into the dynamics of biological processes, and potential perturbations of these leading to disease, through the discovery of patterns in time series gene expression data. For many years, clustering has remained the standard technique to group genes exhibiting similar response profiles. However, clustering defines similarity across all time points, focusing on global patterns which tend to characterize rather broad and unspecific responses. It is widely believed that local patterns offer additional insight into the underlying intricate events leading to the overall observed behavior. Efficient biclustering algorithms have been devised for the discovery of temporally aligned local patterns in gene expression time series, but the extraction of time-lagged patterns remains a challenge due to the combinatorial explosion of pattern occurrence combinations when delays are considered. We present heuristic approaches enabling polynomial rather than exponential time solutions for the problem.","PeriodicalId":90497,"journal":{"name":"Evolutionary computation, machine learning and data mining in bioinformatics. EvoBIO (Conference)","volume":"50 1","pages":"1-9"},"PeriodicalIF":0.0,"publicationDate":"2013-08-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"82039710","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Mining for Variability in the Coagulation Pathway: A Systems Biology Approach","authors":"D. Castaldi, D. Maccagnola, D. Mari, F. Archetti","doi":"10.1007/978-3-642-37189-9_14","DOIUrl":"https://doi.org/10.1007/978-3-642-37189-9_14","url":null,"abstract":"","PeriodicalId":90497,"journal":{"name":"Evolutionary computation, machine learning and data mining in bioinformatics. EvoBIO (Conference)","volume":"5 3","pages":"153-164"},"PeriodicalIF":0.0,"publicationDate":"2013-04-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"72594568","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Structured Populations and the Maintenance of Sex","authors":"P. Whigham, Grant Dick, A. Wright, H. Spencer","doi":"10.1007/978-3-642-37189-9_6","DOIUrl":"https://doi.org/10.1007/978-3-642-37189-9_6","url":null,"abstract":"","PeriodicalId":90497,"journal":{"name":"Evolutionary computation, machine learning and data mining in bioinformatics. EvoBIO (Conference)","volume":"62 1","pages":"56-67"},"PeriodicalIF":0.0,"publicationDate":"2013-04-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"81271383","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}