IEEE ... International Conference on Computational Advances in Bio and Medical Sciences : [proceedings]. IEEE International Conference on Computational Advances in Bio and Medical Sciences: latest papers

Poster: Diagnosing and treating code-duplication problems in bioinformatics libraries
M. S. Hasan, S. Tithi, E. Tilevich, Liqing Zhang
{"title":"Poster: Diagnosing and treating code-duplication problems in bioinformatics libraries","authors":"M. S. Hasan, S. Tithi, E. Tilevich, Liqing Zhang","doi":"10.1109/ICCABS.2016.7802784","DOIUrl":"https://doi.org/10.1109/ICCABS.2016.7802784","url":null,"abstract":"As computing is an enabling tool of bioinformatics, software quality can influence not only the efficiency of the research process, but also the degree of confidence in scientific findings. As we discovered, popular bioinformatics C++ libraries suffer from problems that make their code hard to maintain, fine-tune, and extend. In particular, code duplication caused by the ubiquitous copy-and-paste development practice substantially complicates software maintenance and evolution. The presence of multiple clones of the same code snippet multiplies the amount of effort required to modify or extend it. In this paper, we present the results of a systematic study we have conducted to understand the code quality of popular bioinformatics libraries. Based on the results of our study, we developed an automated tool that systematically identifies and consolidates duplicated code blocks. Here we describe our tool, ReBio, and the results of applying it to improve the quality of several commonly used C++ libraries, including SeqAn, BEDtools, and NCBI C++ Toolkit. Our results reveal that these libraries indeed suffer from poor maintainability, and that our automated tool can effectively improve their quality.","PeriodicalId":89933,"journal":{"name":"IEEE ... International Conference on Computational Advances in Bio and Medical Sciences : [proceedings]. IEEE International Conference on Computational Advances in Bio and Medical Sciences","volume":"13 1","pages":"1-2"},"PeriodicalIF":0.0,"publicationDate":"2016-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"77747289","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
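The abstract above describes identifying duplicated code blocks but does not say how the tool does it. As a rough illustration only, exact copy-and-paste clones can be found by hashing sliding windows of whitespace-normalized lines. The function names and the window size below are assumptions made for this sketch and are not taken from the paper.

```python
import hashlib
from collections import defaultdict


def normalize(line: str) -> str:
    """Collapse whitespace so formatting differences do not hide a clone."""
    return " ".join(line.split())


def find_duplicate_blocks(source: str, window: int = 3) -> dict[str, list[int]]:
    """Map the hash of each `window`-line block to the 1-based lines where it starts.

    Blank lines are skipped. Only hashes seen at more than one location are returned.
    """
    # Keep the original line number of every non-blank line.
    lines = [(i + 1, normalize(text))
             for i, text in enumerate(source.splitlines())
             if text.strip()]
    seen = defaultdict(list)
    for start in range(len(lines) - window + 1):
        block = "\n".join(text for _, text in lines[start:start + window])
        digest = hashlib.sha1(block.encode("utf-8")).hexdigest()
        seen[digest].append(lines[start][0])
    return {h: locs for h, locs in seen.items() if len(locs) > 1}


example = """\
int a = x + 1;
int b = a * 2;
return b;

int  a = x + 1;
int b = a * 2;
return b;
"""
clones = find_duplicate_blocks(example)
```

This only catches textually identical blocks. Clones with renamed identifiers would need token-level or syntax-tree comparison, which this sketch does not attempt.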
Detecting patterns of co-variation in deep-sequenced virus populations
Susana Posada-Céspedes, David Seifert, N. Beerenwinkel
{"title":"Detecting patterns of co-variation in deep-sequenced virus populations","authors":"Susana Posada-Céspedes, David Seifert, N. Beerenwinkel","doi":"10.1109/ICCABS.2016.7802787","DOIUrl":"https://doi.org/10.1109/ICCABS.2016.7802787","url":null,"abstract":"Advances in high-throughput sequencing (HTS) technologies have facilitated the assessment of the genetic diversity of heterogeneous virus populations at an unprecedented level of detail. However, technical errors confound the identification of true variants. Here, we present a comparative approach for the identification of patterns of co-variation in deep-sequenced virus populations. In addition to sequencing errors, we account for other unknown sources of error by modeling the occurrences of patterns of mutations using the Dirichlet distribution as a prior for the multinomial distribution.","PeriodicalId":89933,"journal":{"name":"IEEE ... International Conference on Computational Advances in Bio and Medical Sciences : [proceedings]. IEEE International Conference on Computational Advances in Bio and Medical Sciences","volume":"25 1","pages":"1"},"PeriodicalIF":0.0,"publicationDate":"2016-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"90434329","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
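The abstract above mentions a Dirichlet prior for multinomial counts of mutation patterns. A minimal sketch of that conjugate update follows: with prior pseudo-counts alpha and observed counts n, the posterior is Dirichlet(alpha + n), so the posterior mean frequency of pattern i is (alpha_i + n_i) / (sum(alpha) + sum(n)). The function name and the example counts are invented for illustration and do not come from the paper.

```python
def posterior_mean(counts: list[int], alpha: list[float]) -> list[float]:
    """Posterior mean pattern frequencies under a Dirichlet prior.

    counts: observed read counts per mutation pattern (multinomial data).
    alpha:  Dirichlet pseudo-counts, one per pattern, all positive.
    """
    if len(counts) != len(alpha):
        raise ValueError("counts and alpha must have the same length")
    if any(a <= 0 for a in alpha):
        raise ValueError("Dirichlet parameters must be positive")
    total = sum(counts) + sum(alpha)
    return [(n + a) / total for n, a in zip(counts, alpha)]


# Hypothetical read counts for four patterns, with a uniform prior.
freqs = posterior_mean([90, 8, 2, 0], [1.0, 1.0, 1.0, 1.0])
```

The prior keeps a pattern with zero observed reads at a small nonzero frequency rather than ruling it out entirely.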
Computationally resolving heterogeneity in mixed genomic samples
R. Schwartz
{"title":"Computationally resolving heterogeneity in mixed genomic samples","authors":"R. Schwartz","doi":"10.1109/ICCABS.2016.7802796","DOIUrl":"https://doi.org/10.1109/ICCABS.2016.7802796","url":null,"abstract":"With ever-advancing genomic technologies, it has become increasingly clear that cell-to-cell genomic variability is a ubiquitous feature of multicellular systems with importance to numerous phenomena in health and disease. While technologies for single-cell genomics are rapidly improving, though, they are still impractical for the scales needed to characterize genomic heterogeneity of complex mixtures across large patient populations, leaving the field highly dependent on computational inference to fill in the gaps in what it is practical to measure experimentally. Genomic deconvolution and phylogenetic methods have become subfields in themselves for making sense of still-limited genomic data in terms of coherent models of genomic heterogeneity. There is probably no system for which this phenomenon has been more intensively studied than cancers, where cell-to-cell genetic heterogeneity is now appreciated as key to tumor initiation, progression, and response to treatment. This talk will explore computational challenges in reconstructing models of genomic heterogeneity and the evolutionary processes by which it develops, as well as strategies for meeting those challenges, with particular focus on intra-tumor heterogeneity. It will in the process explore computational strategies for various sources of genomic data (bulk, single-cell, and combinations) and examine the tradeoffs between them. It will conclude with consideration of some emerging directions and open problems in studies of heterogeneity in multicellular systems, in cancers and beyond.","PeriodicalId":89933,"journal":{"name":"IEEE ... International Conference on Computational Advances in Bio and Medical Sciences : [proceedings]. IEEE International Conference on Computational Advances in Bio and Medical Sciences","volume":"106 1","pages":"1"},"PeriodicalIF":0.0,"publicationDate":"2016-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"78706962","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Region-based custom chip description formats for reanalysis of publicly available Affymetrix® GeneChip® data sets
Ernur Saka, Benjamin J. Harrison, Kirk L. West, J. Petruska, E. Rouchka
{"title":"Region-based custom chip description formats for reanalysis of publicly available Affymetrix® GeneChip® data sets","authors":"Ernur Saka, Benjamin J. Harrison, Kirk L. West, J. Petruska, E. Rouchka","doi":"10.1109/ICCABS.2016.7802781","DOIUrl":"https://doi.org/10.1109/ICCABS.2016.7802781","url":null,"abstract":"Commercially developed microarrays, such as those from Agilent® and Affymetrix®, allow for the analysis of differential gene expression changes on a genome-wide scale. Public repositories of microarray data, most notably ArrayExpress and the Gene Expression Omnibus (GEO), have made available millions of microarray samples to researchers worldwide. One of the drawbacks of microarray technology is the static construction of probes based on current genomic knowledge and gene annotation information available at the design phase. As the knowledge base about genes expands, including alternative isoform formation and alternative polyadenylation signaling, the need for a dynamically changing approach to microarray expression analysis has become apparent. We have therefore designed a framework for the reanalysis of publicly available microarray datasets by updating probe set construction based on gene, transcript, and region-based (UTR, exon, CDS) annotations. Our analysis of two publicly available GEO series, GSE48611 and GSE72551, illustrates that the analysis of expression changes using different annotation groupings yields additional insight into changes in transcript expression, in particular, 3' UTR dynamics, which are likely to present phenotypical differences.","PeriodicalId":89933,"journal":{"name":"IEEE ... International Conference on Computational Advances in Bio and Medical Sciences : [proceedings]. IEEE International Conference on Computational Advances in Bio and Medical Sciences","volume":"1 1","pages":"1"},"PeriodicalIF":0.0,"publicationDate":"2016-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"73245753","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Semi-automatic mining of correlated data from a complex database: Correlation network visualization
M. Lexa, Radovan Lapar
{"title":"Semi-automatic mining of correlated data from a complex database: Correlation network visualization","authors":"M. Lexa, Radovan Lapar","doi":"10.1109/ICCABS.2016.7802783","DOIUrl":"https://doi.org/10.1109/ICCABS.2016.7802783","url":null,"abstract":"In previous work we have addressed the issue of frequent ad-hoc queries in deeply-structured databases. We wrote a library of functions, AutodenormLib.py, for issuing proper JOIN commands to denormalize an arbitrary subset of stored data for downstream processing. This may include statistical analysis, visualization or machine learning. Here, we visualize the content of the Thalamoss biomedical database as a correlation network. The network is created by calculating pairwise correlations across all pairs of variables, whether they be numerical, ordinal or nominal. We subsequently construct the network over the entire set of variables, clustering variables with similar effects to discover group relationships between the various biomedical characteristics. We use a semi-automatic procedure that makes the selection of all pairs possible and discuss issues of dealing with different types of variables. This is done either by limiting the analysis to numerical and ordinal ones, or by binning their values into intervals. Knowledge extracted from the data in this mode can be used to select variables for statistical models, or as markers of medically interesting conditions.","PeriodicalId":89933,"journal":{"name":"IEEE ... International Conference on Computational Advances in Bio and Medical Sciences : [proceedings]. IEEE International Conference on Computational Advances in Bio and Medical Sciences","volume":"8 1","pages":"1-2"},"PeriodicalIF":0.0,"publicationDate":"2016-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"73127202","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1
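The correlation network described in the abstract above can be sketched for the numerical case as follows: compute a correlation for every pair of variables and keep an edge when its absolute value passes a threshold. This is an illustrative sketch, not the authors' procedure. It uses Pearson correlation only, and the variable names, threshold and data are made up.

```python
from itertools import combinations
from math import sqrt


def pearson(x: list[float], y: list[float]) -> float:
    """Pearson correlation of two equal-length numeric columns."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no correlation signal
    return cov / sqrt(var_x * var_y)


def correlation_network(table: dict[str, list[float]],
                        threshold: float = 0.8) -> list[tuple[str, str, float]]:
    """Return edges (var_a, var_b, r) for every pair with |r| >= threshold."""
    edges = []
    for a, b in combinations(sorted(table), 2):
        r = pearson(table[a], table[b])
        if abs(r) >= threshold:
            edges.append((a, b, r))
    return edges


# Hypothetical denormalized table: one list of values per variable.
table = {
    "age": [30, 40, 50, 60],
    "bp": [110, 120, 130, 140],
    "noise": [5, 1, 4, 2],
}
edges = correlation_network(table)
```

Nominal variables would need a different association measure or the binning step the abstract mentions. Neither is covered here.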
Computational analysis of drug addiction epidemiology by integrating molecular mapping and social media signals
Rahul Singh
{"title":"Computational analysis of drug addiction epidemiology by integrating molecular mapping and social media signals","authors":"Rahul Singh","doi":"10.1109/ICCABS.2016.7802786","DOIUrl":"https://doi.org/10.1109/ICCABS.2016.7802786","url":null,"abstract":"Drug abuse is amongst the most significant factors impacting the health of Americans. The dynamic nature of this problem is characterized by a number of issues including the continual penetration of novel chemical entities into the abuse-dependency cycle, recognition of dependency elicited by entities, such as opioids, that were hitherto considered to be harmless, the multistage nature of the addiction process in an individual, and finally the spread to ever-different sections of the populace. The interplay of these factors makes early identification of emerging substance use trends, studying the epidemiology, and designing effective interventions especially complex. This research seeks to ameliorate this complexity by integrating two methodological directions: molecular maps that help contextualize the chemical etiology of addiction and creation of dynamic models of addiction through extraction, modeling and analysis of human-factors related information from a relatively new source, namely, social media.","PeriodicalId":89933,"journal":{"name":"IEEE ... International Conference on Computational Advances in Bio and Medical Sciences : [proceedings]. IEEE International Conference on Computational Advances in Bio and Medical Sciences","volume":"36 1","pages":"1"},"PeriodicalIF":0.0,"publicationDate":"2016-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"89990156","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Prediction of missing common genes for disease pairs using network based module separation
P. Akram, Li Liao
{"title":"Prediction of missing common genes for disease pairs using network based module separation","authors":"P. Akram, Li Liao","doi":"10.1109/ICCABS.2016.7802782","DOIUrl":"https://doi.org/10.1109/ICCABS.2016.7802782","url":null,"abstract":"Identifying genes that are associated with two or more diseases can shed light on the pathobiological mechanisms of these diseases. In this work we present a novel method to predict missing common genes for disease pairs. The method formulates the search for missing common genes as an optimization problem that minimizes a network-based module separation between two subgraphs formed by mapping the disease-associated genes onto the interactome. Tested on a dataset of more than 600 disease pairs using cross-validation, the method achieves an average ROC score of 0.95.","PeriodicalId":89933,"journal":{"name":"IEEE ... International Conference on Computational Advances in Bio and Medical Sciences : [proceedings]. IEEE International Conference on Computational Advances in Bio and Medical Sciences","volume":"165 1","pages":"1"},"PeriodicalIF":0.0,"publicationDate":"2016-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"85183338","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1
A deep learning method for lincRNA identification using auto-encoder algorithm
Ning Yu, Zeng Yu, Yi Pan
{"title":"A deep learning method for lincRNA identification using auto-encoder algorithm","authors":"Ning Yu, Zeng Yu, Yi Pan","doi":"10.1109/ICCABS.2016.7802797","DOIUrl":"https://doi.org/10.1109/ICCABS.2016.7802797","url":null,"abstract":"LincRNAs are four times more numerous than coding RNA sequences. However, only 21 thousand lincRNAs have been computationally discovered so far [1]. Although this was one of the most important findings in lincRNA identification, the identification of lincRNAs is far from complete, and the predicted lincRNAs have not yet been validated. Currently, newly identified lincRNAs come mostly from the computational analysis of RNA-seq transcript data, while deep-learning-based methods are rarely used for detecting and validating lincRNAs.","PeriodicalId":89933,"journal":{"name":"IEEE ... International Conference on Computational Advances in Bio and Medical Sciences : [proceedings]. IEEE International Conference on Computational Advances in Bio and Medical Sciences","volume":"94 1","pages":"1"},"PeriodicalIF":0.0,"publicationDate":"2016-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"77119184","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 2
Genome-wide identification and evolutionary analysis of long non-coding RNAs in cereals
Ying Sun, W. L. Rogers, K. Devos, Liming Cai, R. Malmberg
{"title":"Genome-wide identification and evolutionary analysis of long non-coding RNAs in cereals","authors":"Ying Sun, W. L. Rogers, K. Devos, Liming Cai, R. Malmberg","doi":"10.1109/ICCABS.2016.7802791","DOIUrl":"https://doi.org/10.1109/ICCABS.2016.7802791","url":null,"abstract":"We identified lncRNA candidates in four economically important cereals (Poaceae): 7,196 in Zea mays, 1,974 in Sorghum bicolor, 4,236 in Setaria italica and 2,542 in Oryza sativa, using computational methods; we then compared these RNAs across the species. Our approach involved screening a reference-guided transcriptome assembly of RNA-Seq data for RNAs that were at least 200 bases in length with at most 70 amino acids in open reading frames and with a lack of homology in the Uniprot database. A sequence composition analysis of the lncRNA candidates, in comparison to protein-coding transcripts, highlighted distinctive features, including a low GC content, a paucity of introns and a hexamer usage bias, consistent with what has been found for mammalian lncRNAs. RepeatMasker identified from 1% (rice) to 19% (maize) of the candidate lncRNAs as being transcribed from transposable elements, based on a dataset with 3,853 transposable elements. We compared the candidate lncRNAs with 25,141 miRNAs from miRBase, and found that less than 1% of them could be potential miRNA precursors. The cross-species comparisons, which included a sequence- and structure-based lncRNA homology search, synteny analysis, and lncRNA secondary structure prediction, uncovered some limited sequence similarity. In sub-regions, we predicted conserved secondary structures using covariation analysis. We used the comparative sequence and synteny analyses to predict the existence of lncRNAs in S. italica; experimental tests confirmed the presence of these RNAs. Our results are consistent with a model of very rapid evolution of lncRNAs.","PeriodicalId":89933,"journal":{"name":"IEEE ... International Conference on Computational Advances in Bio and Medical Sciences : [proceedings]. IEEE International Conference on Computational Advances in Bio and Medical Sciences","volume":"12 1","pages":"1"},"PeriodicalIF":0.0,"publicationDate":"2016-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"87088111","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
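Two of the screening criteria quoted in the abstract above (length of at least 200 bases, at most 70 amino acids in an open reading frame) can be sketched as a simple filter. The sketch rests on assumptions the abstract does not state: only the three forward reading frames are scanned, an ORF runs from ATG to the first in-frame stop codon, and an ORF that reaches the end of the transcript is counted. The UniProt homology filter is not covered, and all names are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def longest_orf_aa(seq: str) -> int:
    """Length in amino acids of the longest forward-strand ORF (ATG up to a stop)."""
    seq = seq.upper().replace("U", "T")  # accept RNA or DNA alphabets
    best = 0
    for frame in range(3):
        length = 0      # codons in the currently open ORF, start codon included
        in_orf = False
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if not in_orf:
                if codon == "ATG":
                    in_orf, length = True, 1
            elif codon in STOP_CODONS:
                best = max(best, length)
                in_orf = False
            else:
                length += 1
        if in_orf:      # ORF runs off the end of the transcript: still count it
            best = max(best, length)
    return best


def is_lncrna_candidate(seq: str, min_len: int = 200, max_orf_aa: int = 70) -> bool:
    """Apply the length and ORF-size thresholds quoted in the abstract."""
    return len(seq) >= min_len and longest_orf_aa(seq) <= max_orf_aa
```

For example, `is_lncrna_candidate("ACGT" * 60)` passes both thresholds (240 bases, no start codon), while a 306-base transcript encoding a 101-residue ORF does not.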
UPS-indel: A better approach for finding indel redundancy
M. S. Hasan, Xiaowei Wu, L. Watson, Zhiyi Li, Liqing Zhang
{"title":"UPS-indel: A better approach for finding indel redundancy","authors":"M. S. Hasan, Xiaowei Wu, L. Watson, Zhiyi Li, Liqing Zhang","doi":"10.1109/ICCABS.2016.7802793","DOIUrl":"https://doi.org/10.1109/ICCABS.2016.7802793","url":null,"abstract":"An indel, the insertion or deletion of base pairs in the sequence of an organism, is a very common form of genetic variation in the human genome. Because indels contribute to genetic diversity and human disease, they are an important topic in the genome research community. With progress in Next Generation Sequencing (NGS), a good number of indel calling tools have been developed, and different databases store the results of different indel calling tools for future research. Different indels, though differing in allele sequence and position, can be biologically equivalent when they lead to the same altered sequences. Storing these biologically equivalent indels as distinct entries in databases causes data redundancy. Previous research showed that about 10% of human indels stored in dbSNP are redundant due to the lack of a unified system for identifying and representing equivalent indels. In this paper we describe UPS-indel, a utility tool that creates a universal positioning system for indels so that equivalent indels can be identified easily by a simple comparison of their coordinates generated by the proposed positioning system. Applying UPS-indel, we find that nearly 15% of indels in dbSNP (version 142) across all human chromosomes are redundant, higher than the previous report. UPS-indel is written in C++ and is freely available at http://bench.cs.vt.edu/ups-indel.","PeriodicalId":89933,"journal":{"name":"IEEE ... International Conference on Computational Advances in Bio and Medical Sciences : [proceedings]. IEEE International Conference on Computational Advances in Bio and Medical Sciences","volume":"90 1","pages":"1"},"PeriodicalIF":0.0,"publicationDate":"2016-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"88528795","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1
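The abstract above does not give the details of the UPS-indel positioning system. To illustrate the underlying idea only (equivalent indels produce the same altered sequence), here is a sketch of a generic left-shift normalization: an indel is moved left for as long as the altered sequence stays the same, so equivalent indels end up with identical coordinates. This is a common normalization technique, not the UPS-indel algorithm, and the function names and 0-based coordinates are assumptions of the sketch.

```python
def apply_indel(ref: str, pos: int, kind: str, seq: str) -> str:
    """Return the altered sequence. 'DEL' removes seq at pos; 'INS' inserts seq before pos."""
    if kind == "DEL":
        return ref[:pos] + ref[pos + len(seq):]
    return ref[:pos] + seq + ref[pos:]


def left_normalize(ref: str, pos: int, kind: str, seq: str) -> tuple[int, str, str]:
    """Shift an indel to its leftmost equivalent position (0-based)."""
    if kind not in ("INS", "DEL") or not seq:
        raise ValueError("kind must be 'INS' or 'DEL' and seq must be non-empty")
    if kind == "DEL" and ref[pos:pos + len(seq)] != seq:
        raise ValueError("deleted sequence does not match the reference")
    # Moving one base left keeps the altered sequence unchanged exactly when the
    # base left of the indel equals the last base of the indel sequence.
    while pos > 0 and ref[pos - 1] == seq[-1]:
        seq = seq[-1] + seq[:-1]   # rotate the allele by one base
        pos -= 1
    return pos, kind, seq


ref = "ACAGAGT"
first = left_normalize(ref, 6, "INS", "AG")    # insert AG before the final T
second = left_normalize(ref, 3, "INS", "GA")   # insert GA inside the AG repeat
```

Both insertions produce `ACAGAGAGT`, and both normalize to the same coordinates, so a plain equality check on the normalized tuples detects the redundancy.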