{"title":"Heart Disease Prediction using Chi-Square Test and Linear Regression","authors":"Dinesh Kalla, Arvind Chandrasekaran","doi":"10.5121/csit.2023.130712","DOIUrl":null,"url":null,"abstract":"Heart disease is most common disease reported currently in the United States among both the genders and according to official statistics about fifty percent of the American population is suffering from some form of cardiovascular disease. This paper performs chi square tests and linear regression analysis to predict heart disease based on the symptoms like chest pain and dizziness. This paper will help healthcare sectors to provide better assistance for patients suffering from heart disease by predicting it in beginning stage of disease. Chi square test is conducted to identify whether there is a relation between chest pain and heart disease cases in the United States by analyzing heart disease dataset from IEEE Data Port. The test results and analysis show that males in the United States are most likely to develop heart disease with the symptoms like chest pain, dizziness, shortness of breath, fatigue, and nausea. This test also shows that there is a week corelation of 0.5 is identified which shows people with all ages including teens can face heart diseases and its prevalence increase with age. Also, the tests indicate that 90 percent of the participant who are facing severe chest pain is suffering from heart disease where majority of the successful heart disease identified is in males and only 10 percent participants are identified as healthy. The evaluated p-values are much greater than the statistical threshold of 0.05 which concludes factors like sex, Exercise angina, Cholesterol, old peak, ST_Slope, obesity, and blood sugar play significant role in onset of cardiovascular disease. We have tested the dataset with prediction model built on logistic regression and observed an accuracy of 85.12 percent.","PeriodicalId":42597,"journal":{"name":"ADCAIJ-Advances in Distributed Computing and Artificial Intelligence Journal","volume":null,"pages":null},"PeriodicalIF":1.7000,"publicationDate":"2023-04-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"ADCAIJ-Advances in Distributed Computing and Artificial Intelligence Journal","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.5121/csit.2023.130712","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q3","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
引用次数: 0
Abstract
Heart disease is most common disease reported currently in the United States among both the genders and according to official statistics about fifty percent of the American population is suffering from some form of cardiovascular disease. This paper performs chi square tests and linear regression analysis to predict heart disease based on the symptoms like chest pain and dizziness. This paper will help healthcare sectors to provide better assistance for patients suffering from heart disease by predicting it in beginning stage of disease. Chi square test is conducted to identify whether there is a relation between chest pain and heart disease cases in the United States by analyzing heart disease dataset from IEEE Data Port. The test results and analysis show that males in the United States are most likely to develop heart disease with the symptoms like chest pain, dizziness, shortness of breath, fatigue, and nausea. This test also shows that there is a week corelation of 0.5 is identified which shows people with all ages including teens can face heart diseases and its prevalence increase with age. Also, the tests indicate that 90 percent of the participant who are facing severe chest pain is suffering from heart disease where majority of the successful heart disease identified is in males and only 10 percent participants are identified as healthy. The evaluated p-values are much greater than the statistical threshold of 0.05 which concludes factors like sex, Exercise angina, Cholesterol, old peak, ST_Slope, obesity, and blood sugar play significant role in onset of cardiovascular disease. We have tested the dataset with prediction model built on logistic regression and observed an accuracy of 85.12 percent.