S. N. Nagabhushan, T. Ahn, M. Srikanth, T. Park, Ajit S. Bopardikar, R. Narayanan
A data aggregation framework for cancer subtype discovery
2012 IEEE International Conference on Bioinformatics and Biomedicine Workshops, 2012-10-04
DOI: 10.1109/BIBMW.2012.6470250 (https://doi.org/10.1109/BIBMW.2012.6470250)
Citation count: 0
Abstract
Personalized genomic medicine aims to revolutionize healthcare by applying our growing understanding of the molecular basis of disease to effective diagnosis and personalized therapy. Computational research in this arena faces major challenges, such as handling large volumes of highly heterogeneous data sets. To extract knowledge, researchers must integrate data from several sources and efficiently query these large and diverse data sets. This presents daunting informatics challenges, such as finding a suitable data representation for computational inference (knowledge representation), linking heterogeneous data sets (data integration), and keeping track of the source of the data to be aggregated. Many of these challenges can be categorized as data integration problems. In this paper, we present relevant methodologies from the field of data integration as potential solutions to the challenges that computational biologists encounter when handling diverse data. The work presented in this paper represents the first crucial step towards identifying cancer biomarkers, leading to cancer pathway signatures and personalized medicine.