Nikki Aaron, Prabhjot Singh, Siddharth Surapaneni, Joseph Wysocki
Mutational Hotspot Detection in LGL Leukemia
2021 Systems and Information Engineering Design Symposium (SIEDS), published 2021-04-30
DOI: 10.1109/SIEDS52267.2021.9483797 (https://doi.org/10.1109/SIEDS52267.2021.9483797)
Cancer genomics has focused primarily on identifying and studying mutations that are over-represented in known genes. This project instead scans entire chromosomes and labels unusually mutation-dense loci as "genomic probabilistic hotspots" (GPHs). A GPH is defined as any area on a patient’s chromosome where the observed rate of mutations within a given chromosome window far exceeds what would be expected from random variation. The approach was applied to 39 patients diagnosed with large granular lymphocyte (LGL) leukemia, a rare form of blood cancer. To calculate expected mutation rates in non-LGL individuals, data were obtained from the 1000 Genomes Project. A negative binomial test was employed to isolate specific GPHs where the mutation density within the LGL patient sample was significantly high. The negative binomial approach identified a median of 1 to 2 patient hotspots per chromosome, with a mean Jaccard distance between patients of 0.90. The kernel density estimation (KDE) method found a median of 40 hotspots with wider spans, resulting in a mean Jaccard distance of 0.43. The negative binomial results therefore indicated heterogeneity in hotspot locations between patients, whereas the KDE results were more homogeneous. The negative binomial approach is best for pinpointing the most significantly dense regions, whereas KDE is best for identifying all broad regions that are more mutated than a reference. These new, gene-agnostic approaches provide novel methods to search chromosomes for mutational abnormalities and can be generalized and scaled to any clinical syndrome. Future directions include extending the GPH method across genomes and developing a robust library of disease-specific and/or model-species-specific hotspot profiles. These may serve as reference guides in studies seeking to understand the exact biochemical processes driving the onset and progression of rare cancers.
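
To make the negative binomial test and the Jaccard comparison concrete, the following is a minimal sketch and not the authors' implementation. The abstract does not state the window size, the parameterization of the negative binomial, or the significance threshold, so the fixed non-overlapping windows, the `expected` and `dispersion` parameters (imagined here as estimated from reference data), the `alpha` cutoff, and all function names are assumptions made for illustration only. A Jaccard distance near 1 means two patients share almost no hotspot windows; a value near 0 means their hotspots largely coincide.

```python
import math


def nb_pmf(k, size, p):
    """P(X = k) for a negative binomial with the given size and success probability p."""
    log_pmf = (
        math.lgamma(k + size) - math.lgamma(size) - math.lgamma(k + 1)
        + size * math.log(p) + k * math.log1p(-p)
    )
    return math.exp(log_pmf)


def nb_upper_tail(k, mean, dispersion):
    """P(X >= k) for a negative binomial parameterized by its mean and size (dispersion)."""
    if mean <= 0 or dispersion <= 0:
        raise ValueError("mean and dispersion must be positive")
    p = dispersion / (dispersion + mean)  # gives E[X] = mean
    below = sum(nb_pmf(i, dispersion, p) for i in range(k))
    return max(0.0, 1.0 - below)


def hotspot_windows(positions, window, expected, dispersion, alpha=0.001):
    """Return indices of fixed windows whose mutation count is significantly high.

    Assumption: non-overlapping windows of `window` bases and one shared
    expected count per window; both are illustrative choices.
    """
    counts = {}
    for pos in positions:
        idx = pos // window
        counts[idx] = counts.get(idx, 0) + 1
    return {
        idx for idx, count in counts.items()
        if nb_upper_tail(count, expected, dispersion) < alpha
    }


def jaccard_distance(a, b):
    """1 - |A intersect B| / |A union B|; defined as 0.0 when both sets are empty."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


# Toy mutation positions on one chromosome (made-up numbers, not study data).
patient_a = list(range(2000, 2800, 100)) + [7000]
patient_b = list(range(2000, 2700, 100)) + list(range(5000, 5700, 100))

hot_a = hotspot_windows(patient_a, window=1000, expected=0.5, dispersion=1.0)
hot_b = hotspot_windows(patient_b, window=1000, expected=0.5, dispersion=1.0)
print(hot_a, hot_b, jaccard_distance(hot_a, hot_b))
```

In this toy example the isolated mutation at position 7000 is not flagged, because a single mutation in a window is unremarkable under the assumed background rate, while the dense clusters are. A KDE-based variant would replace the per-window count test with a smoothed density compared against a reference density and is not sketched here.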