Exploratory Data Analysis on Breast cancer dataset about Survivability and Recurrence
Authors: E. J. Sweetlin, S. Saudia
Venue: 2021 3rd International Conference on Signal Processing and Communication (ICSPC)
Published: 2021-05-13
DOI: https://doi.org/10.1109/ICSPC51351.2021.9451811
Exploratory Data Analysis (EDA) is an important step in data analysis: it helps data analysts and researchers represent data visually and uncover patterns that reveal the knowledge embedded in a dataset. In the medical domain, data analysis primarily helps physicians and health care researchers, for whom patient data is available as text and images. Analyzing patients' previous records often helps in choosing the right course of treatment. The proposed EDA analyzes three attributes from the METABRIC breast cancer dataset, the Nottingham Prognostic Index (NPI), the Overall Survival Status (OSS) and the Relapse Free Status (RFS), to determine survivability and disease recurrence among different age categories of breast cancer patients over 5-year and 10-year periods. The EDA is carried out with Python visualization tools, and the observations are presented using swarm plots and tabulations. Survival rates by NPI are also compared with the survival rates reported for the Breast Test Wales and Grimsby Breast Unit datasets.
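The kind of tabulation described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' code: the record fields (`age`, `months`, `deceased`), the age cut-offs, and the sample patients are hypothetical, and the survival figure is a crude proportion that drops patients censored before the horizon rather than a Kaplan-Meier estimate. The NPI formula and the good/moderate/poor cut-offs are the commonly cited ones.

```python
from collections import defaultdict


def npi(tumor_size_cm, node_stage, grade):
    """Nottingham Prognostic Index: 0.2 * size (cm) + node stage (1-3) + grade (1-3)."""
    return 0.2 * tumor_size_cm + node_stage + grade


def npi_group(score):
    """Commonly cited prognostic groups; the paper may use finer bands."""
    if score <= 3.4:
        return "good"
    if score <= 5.4:
        return "moderate"
    return "poor"


def age_category(age):
    """Hypothetical age bands, chosen only for this sketch."""
    if age < 40:
        return "<40"
    if age < 60:
        return "40-59"
    return "60+"


def crude_survival(records, horizon_months):
    """Share of evaluable patients per age band known to be alive at the horizon.

    A patient is evaluable if followed up to the horizon or recorded as
    deceased before it; patients censored earlier are skipped.
    """
    counts = defaultdict(lambda: [0, 0])  # band -> [survived, evaluable]
    for r in records:
        followed_past = r["months"] >= horizon_months
        died_before = r["deceased"] and not followed_past
        if not (followed_past or died_before):
            continue  # censored before the horizon: outcome unknown
        band = age_category(r["age"])
        counts[band][1] += 1
        if followed_past:
            counts[band][0] += 1
    return {band: s / n for band, (s, n) in counts.items()}


# Synthetic records for illustration only (not METABRIC data).
patients = [
    {"age": 35, "months": 130, "deceased": False},
    {"age": 38, "months": 48, "deceased": True},
    {"age": 52, "months": 90, "deceased": True},
    {"age": 55, "months": 30, "deceased": False},
    {"age": 67, "months": 125, "deceased": True},
    {"age": 71, "months": 70, "deceased": False},
]

five_year = crude_survival(patients, 60)
ten_year = crude_survival(patients, 120)

if __name__ == "__main__":
    print("5-year:", five_year)
    print("10-year:", ten_year)
    print("NPI group:", npi_group(npi(2.5, 2, 2)))
```

With real data, the same per-patient values could be passed to a plotting library such as seaborn (`seaborn.swarmplot`) to produce swarm plots by age category.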