Survival Analysis of Young Leukemia Patients
T. Williams
SIAM Undergraduate Research Online, 2019-01-01 (Journal Article)
DOI: 10.1137/19S019085 (https://doi.org/10.1137/19S019085)
Citations: 1
Abstract
Faculty Advisors: Dr. Keshav P. Pokhrel, Dr. Taysseer Sharaf

Cancer is a leading cause of death in the United States, so studying cancer data is imperative for its potential benefit to patients. This paper examines research data from the Surveillance, Epidemiology, and End Results (SEER) program on cancer diagnoses reported from 1973 to 2014, focusing on the incidence of leukemia in young (0-19 years) patients in the United States. The aim is to identify variables, such as prior cancers and treatment, that have a distinct impact on survival time and five-year survival probability, using visualizations and several machine learning techniques. This work culminated in building multiple models to predict a patient's hazard. The two most insightful models were both neural networks. One network used discrete survival time as a covariate to predict a single conditional hazard per patient; its prediction accuracy is nearly 95% on the testing datasets. The other network estimated hazards for discrete time intervals without using survival time as a covariate. It predicted with lower accuracy but better captured the variable effects found in initial testing.
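The relationship between discrete-interval hazards and a five-year survival probability can be illustrated with a minimal sketch. It is not the authors' code: the function name `survival_curve` and the yearly hazard values are hypothetical, chosen only for illustration. It relies on the standard discrete-time identity S(t_k) = (1 - h_1)(1 - h_2)...(1 - h_k), where h_j is the conditional probability of the event in interval j given survival up to that interval.

```python
from itertools import accumulate
from operator import mul


def survival_curve(hazards):
    """Return S(t_k) = prod_{j<=k} (1 - h_j) for each discrete interval k.

    `hazards` holds conditional hazards, one per interval, each in [0, 1].
    """
    if any(h < 0.0 or h > 1.0 for h in hazards):
        raise ValueError("each hazard must be a probability in [0, 1]")
    # Running product of the per-interval survival probabilities (1 - h_j).
    return list(accumulate((1.0 - h for h in hazards), mul))


# Hypothetical yearly conditional hazards for one patient (illustrative
# values only, not taken from the SEER data or from the paper).
yearly_hazards = [0.10, 0.08, 0.05, 0.04, 0.03]

curve = survival_curve(yearly_hazards)
five_year_survival = curve[4]  # survival probability at the end of year 5
print(f"Five-year survival probability: {five_year_survival:.4f}")
```

In a model of the second kind described above, the per-interval hazards would come from the network's outputs; here they are simply hard-coded.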