Detection of Novel Gene Biomarkers in non-small cell lung cancer using integrated approaches in DNA methylation expression
Tun-Wen Pai
Proceedings of the 2022 11th International Conference on Bioinformatics and Biomedical Science, 2022-10-28
DOI: 10.1145/3571532.3571534 (https://doi.org/10.1145/3571532.3571534)
Citations: 0
Abstract
Lung cancer is one of the leading and most widespread causes of cancer-related deaths worldwide. Non-small cell lung cancer (NSCLC) is the leading cause of these fatalities, accounting for 85%. The major subtypes of NSCLC are lung adenocarcinoma (LUAD), lung squamous cell carcinoma (LUSC), and large cell carcinoma. Early-stage detection and surgical removal of the tumor offer a favorable prognosis and better survival rates. However, more than 75% of patients are at stage III/IV at the time of diagnosis, and despite major advances in oncology, survival rates remain poor. Carcinogens produce widespread DNA methylation changes within cells. These changes are characterized by globally hyper- or hypomethylated regions around CpG islands. Many of these changes occur early in tumorigenesis and are highly prevalent across a tumor type. In this study, DNA methylation profiles were extracted from TCGA for 418 LUAD and 370 LUSC patient tissue samples and compared with 32 and 42 non-malignant samples, respectively. A standard pipeline was applied to identify significantly differentially methylated sites as primary biomarkers, while secondary biomarkers were obtained from associated comorbidities and disease-associated genes from a meta-analysis study. Concordant candidates were retained as NSCLC-relevant biomarker candidates. Gene Ontology annotations were used to calculate a gene-pair distance matrix for all candidate biomarkers. Clustering algorithms were then used to categorize the candidate genes into functional groups based on this distance matrix. From these functional groups, 35 CpG loci were identified as key biomarkers by comparing the TCGA training cohort with the GEO testing cohort.
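The abstract does not say which distance measure or clustering algorithm was used. As a minimal sketch of the general idea only, the following assumes a Jaccard distance over each gene's set of Gene Ontology terms and average-linkage agglomerative clustering with a distance cutoff. The gene names, term labels, and the 0.7 threshold are hypothetical toy values, not data from the study.

```python
from itertools import combinations


def jaccard_distance(a, b):
    """Return 1 - |A & B| / |A | B|; two empty sets are treated as maximally distant."""
    union = a | b
    if not union:
        return 1.0
    return 1.0 - len(a & b) / len(union)


def distance_matrix(annotations):
    """Build a symmetric gene-pair distance matrix from gene -> set of annotation terms."""
    genes = sorted(annotations)
    dist = {g: {g: 0.0} for g in genes}
    for g1, g2 in combinations(genes, 2):
        d = jaccard_distance(annotations[g1], annotations[g2])
        dist[g1][g2] = d
        dist[g2][g1] = d
    return dist


def cluster(dist, threshold):
    """Average-linkage agglomerative clustering.

    Repeatedly merges the two closest clusters and stops once the closest
    pair is farther apart than `threshold`.
    """
    clusters = [[g] for g in sorted(dist)]

    def average_link(c1, c2):
        return sum(dist[a][b] for a in c1 for b in c2) / (len(c1) * len(c2))

    while len(clusters) > 1:
        (i, j), best = min(
            (
                ((i, j), average_link(clusters[i], clusters[j]))
                for i, j in combinations(range(len(clusters)), 2)
            ),
            key=lambda pair: pair[1],
        )
        if best > threshold:
            break
        clusters[i] = sorted(clusters[i] + clusters[j])
        del clusters[j]  # j > i, so index i is unaffected
    return sorted(clusters)


# Hypothetical toy annotations (placeholder gene names and term labels).
annotations = {
    "gene_a": {"term1", "term2", "term3"},
    "gene_b": {"term2", "term3", "term4"},
    "gene_c": {"term7", "term8"},
    "gene_d": {"term8", "term9"},
}

dist = distance_matrix(annotations)
groups = cluster(dist, threshold=0.7)
print(groups)
```

Set-overlap distance is only one option; semantic-similarity measures that account for the structure of the ontology are a common alternative, and the choice of cutoff directly controls how many functional groups result.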