{"title":"The COVID-19 Pandemic Prediction in the US Based on Machine Learning","authors":"Wencheng Zou","doi":"10.1109/ICPHDS51617.2020.00062","journal":{"name":"2020 International Conference on Public Health and Data Science (ICPHDS)"},"publicationDate":"2020-11-01","citationCount":"5","platform":"Semanticscholar","ListUrlMain":"https://doi.org/10.1109/ICPHDS51617.2020.00062"}
The COVID-19 Pandemic Prediction in the US Based on Machine Learning
The COVID-19 pandemic is worsening in the United States, and its high infection rate makes the number of infected people hard to predict. In this research, we apply machine learning methods such as linear regression and neural networks to predict the number of positive COVID-19 cases. We also collect state-level data to generate predictions by linear regression, and we find that data from two states, Georgia and Massachusetts, can be used to predict the number of infections nationwide. After dividing our dataset into three consecutive time periods and training separate models on the data from each period, we compare mean squared error (MSE) values and conclude that Lasso performs better than Ridge in the first period, Ridge and Lasso perform similarly in the second period, and Ridge fits the data better than Lasso in the third period. Furthermore, over the dataset as a whole, regardless of the three time periods, single-variable linear regression is more accurate than a fully connected neural network.
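The per-period comparison of Ridge and Lasso by MSE described above can be illustrated with a minimal sketch. It is not the author's code: the data are synthetic, and the function names, the penalty strengths, the 80/20 split inside each period, and the single-feature closed-form fits are all assumptions made for illustration. The abstract does not state how the models were trained or evaluated.

```python
def soft_threshold(z, alpha):
    """Shrink z toward zero by alpha (closed-form Lasso step for one feature)."""
    if z > alpha:
        return z - alpha
    if z < -alpha:
        return z + alpha
    return 0.0


def _centered_sums(x, y):
    """Return the means of x and y plus the centered sums Sxx and Sxy."""
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    return x_mean, y_mean, sxx, sxy


def fit_ridge(x, y, alpha):
    """Minimise sum((y - b0 - b*x)^2) + alpha * b^2; the intercept is unpenalised."""
    x_mean, y_mean, sxx, sxy = _centered_sums(x, y)
    slope = sxy / (sxx + alpha)
    return slope, y_mean - slope * x_mean


def fit_lasso(x, y, alpha):
    """Minimise 0.5 * sum((y - b0 - b*x)^2) + alpha * |b|; the intercept is unpenalised."""
    x_mean, y_mean, sxx, sxy = _centered_sums(x, y)
    slope = soft_threshold(sxy, alpha) / sxx
    return slope, y_mean - slope * x_mean


def mse(y_true, y_pred):
    """Mean squared error between two equal-length sequences."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def split_periods(values, k=3):
    """Split a sequence into k consecutive chunks; the last chunk takes any remainder."""
    size = len(values) // k
    chunks = [values[i * size:(i + 1) * size] for i in range(k - 1)]
    chunks.append(values[(k - 1) * size:])
    return chunks


if __name__ == "__main__":
    # Synthetic, hypothetical series: one state's case counts and a national total.
    days = list(range(90))
    state = [50 + 8 * d + 0.05 * d * d for d in days]
    national = [30 * s + 200 * ((d % 7) - 3) for d, s in zip(days, state)]

    # Hypothetical penalty strengths; the two objectives are scaled differently,
    # so these values are not directly comparable with each other.
    ridge_alpha, lasso_alpha = 1000.0, 1000.0

    periods = zip(split_periods(state), split_periods(national))
    for idx, (xs, ys) in enumerate(periods, start=1):
        cut = int(len(xs) * 0.8)  # assumed split: first 80% to fit, last 20% to score
        for name, fit, alpha in (("Ridge", fit_ridge, ridge_alpha),
                                 ("Lasso", fit_lasso, lasso_alpha)):
            slope, intercept = fit(xs[:cut], ys[:cut], alpha)
            preds = [intercept + slope * v for v in xs[cut:]]
            print(f"period {idx} {name}: MSE = {mse(ys[cut:], preds):.2f}")
```

With one predictor both estimators have closed forms, so no optimisation library is needed. With several predictors, as in the state-level setting, an iterative solver such as coordinate descent would be the usual choice for Lasso.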