S. Sikdar, Hyoyoung Choo Wosoba, Younathan Abdia, S. Dutta, R. Gill, S. Datta, S. Datta
Systems Biomedicine (Austin, Tex.), 2(1): 54-62. Published 2014-07-03. DOI: 10.1080/21628130.2015.1040618 (https://doi.org/10.1080/21628130.2015.1040618)
An integrative exploratory analysis of –omics data from the ICGC cancer genomes lung adenocarcinoma study
It is known that all agents that cause cancer (carcinogens) also cause a change in the DNA sequence. To identify such changes, which are often subtle, we integrate multiple molecular profile data sets released by the International Cancer Genome Consortium (ICGC). The data sets include matched gene and microRNA expression profiles, somatic copy number variation, DNA methylation, and protein expression profiles for lung adenocarcinoma patients receiving treatment. We apply both unsupervised and supervised learning techniques (clustering and penalized regression) to identify molecular markers within each type of –omics profile that can differentiate patients. We also study associations between important markers of two types. An adaptive ensemble binary regression model is presented that uses the entirety of the available –omics profiles, leading to a more accurate clinical prognosis for the patients in the given sample. This integrated study provides a more comprehensive picture of lung adenocarcinoma.
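As a rough illustration of how penalized regression can pick out candidate markers from a single profile type, the following sketch fits an L1-penalized logistic regression by proximal gradient descent on synthetic data. It is not the authors' pipeline: the abstract does not say which penalty, solver, or software was used, so the L1 penalty, the helper names (`fit_l1_logistic`, `simulate`), the tuning values, and the simulated data are all assumptions made for this example.

```python
import math
import random


def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold (proximal step for an L1 penalty)."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def fit_l1_logistic(X, y, penalty=0.05, step=0.5, n_iter=300):
    """Proximal gradient descent for L1-penalized logistic regression.

    Returns (intercept, coefficients); the intercept is not penalized.
    The default penalty and step size are illustrative, not tuned.
    """
    n, p = len(X), len(X[0])
    intercept, coefs = 0.0, [0.0] * p
    for _ in range(n_iter):
        # Residuals: predicted probability minus observed binary label.
        resid = [
            sigmoid(intercept + sum(c * x for c, x in zip(coefs, row))) - label
            for row, label in zip(X, y)
        ]
        grad_intercept = sum(resid) / n
        grad = [sum(r * row[j] for r, row in zip(resid, X)) / n for j in range(p)]
        intercept -= step * grad_intercept
        # Gradient step followed by soft-thresholding sets weak coefficients to zero.
        coefs = [
            soft_threshold(c - step * g, step * penalty)
            for c, g in zip(coefs, grad)
        ]
    return intercept, coefs


def simulate(n=200, p=15, seed=0):
    """Hypothetical stand-in for one profile type: only columns 0 and 1 drive the label."""
    rng = random.Random(seed)
    X = [[rng.gauss(0.0, 1.0) for _ in range(p)] for _ in range(n)]
    y = [1 if rng.random() < sigmoid(2.0 * row[0] - 2.0 * row[1]) else 0 for row in X]
    return X, y


X, y = simulate()
intercept, coefs = fit_l1_logistic(X, y)
selected = [j for j, c in enumerate(coefs) if c != 0.0]
print("selected marker indices:", selected)
```

Features whose coefficients remain nonzero are the candidate markers. A larger `penalty` yields a sparser selection; in practice the penalty would be chosen by cross-validation instead of being fixed in advance.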