Statistical Data-Driven Modelling and Forecasting: An Application to COVID-19 Pandemic

Q1 Decision Sciences

Annals of Data Science. Pub Date: 2024-11-18. DOI: 10.1007/s40745-024-00583-8

Shalabh, Subhra Sankar Dhar, Sabara Parshad Rajeshbhai

Annals of Data Science, 12(5), pp. 1747-1770. Journal Article. https://link.springer.com/article/10.1007/s40745-024-00583-8

Citations: 0

Abstract

One of the key objectives of statistics is to provide a model compatible with the data generated by an unknown random process. Often the unknown process is intractable, and no prior data or information about it is available. Under such circumstances, well-known techniques such as regression modelling may not work. An alternative approach is to observe the general features of the process from the available data and then fit a suitable statistical distribution, such as a mixture of certain distributions, to those data; future observations can then be predicted from the fitted distribution. Prediction for the COVID-19 pandemic is one example. Because the pandemic occurred for the first time, no prior data were available to understand its behaviour and progression. In such cases, a data-based statistical modelling procedure can be adopted to predict future occurrences from a small data set. This article presents such an application-oriented, data-based statistical modelling procedure and implements it on COVID-19 data. The proposed procedure can be used for a wide range of modelling and forecasting problems.
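
The idea of fitting a mixture of distributions to a small data set can be illustrated with a minimal sketch. It is not the authors' procedure: the abstract does not say which component distributions or which estimation method the paper uses. The sketch assumes two Gaussian components fitted by expectation-maximisation (EM), and the function names `fit_two_gaussian_mixture` and `mixture_pdf`, as well as the synthetic data, are hypothetical.

```python
import numpy as np


def fit_two_gaussian_mixture(x, n_iter=200, tol=1e-8):
    """Fit a 1-D two-component Gaussian mixture by EM (illustrative only)."""
    x = np.asarray(x, dtype=float)
    # Crude initialisation: component means at the lower and upper quartiles.
    mu = np.array([np.percentile(x, 25), np.percentile(x, 75)])
    sigma = np.full(2, x.std() + 1e-6)
    w = np.array([0.5, 0.5])
    prev_ll = -np.inf
    for _ in range(n_iter):
        # E-step: responsibility of each component for each observation.
        dens = w * np.exp(-0.5 * ((x[:, None] - mu) / sigma) ** 2) / (
            sigma * np.sqrt(2.0 * np.pi)
        )
        total = np.maximum(dens.sum(axis=1, keepdims=True), 1e-300)
        resp = dens / total
        # M-step: update weights, means and standard deviations.
        nk = resp.sum(axis=0)
        w = nk / x.size
        mu = (resp * x[:, None]).sum(axis=0) / nk
        sigma = np.sqrt((resp * (x[:, None] - mu) ** 2).sum(axis=0) / nk) + 1e-6
        # Stop when the log-likelihood no longer improves noticeably.
        ll = np.log(total).sum()
        if abs(ll - prev_ll) < tol:
            break
        prev_ll = ll
    return w, mu, sigma


def mixture_pdf(t, w, mu, sigma):
    """Evaluate the fitted mixture density at the points t."""
    t = np.atleast_1d(np.asarray(t, dtype=float))
    comp = np.exp(-0.5 * ((t[:, None] - mu) / sigma) ** 2) / (
        sigma * np.sqrt(2.0 * np.pi)
    )
    return comp @ w


# Synthetic small data set (an assumption, not COVID-19 data): two clusters.
rng = np.random.default_rng(0)
x = np.concatenate([rng.normal(20.0, 3.0, 60), rng.normal(45.0, 5.0, 40)])

w, mu, sigma = fit_two_gaussian_mixture(x)

# The fitted density can then be evaluated at values not yet observed.
grid = np.linspace(-50.0, 150.0, 20001)
density = mixture_pdf(grid, w, mu, sigma)
area = density.sum() * (grid[1] - grid[0])  # should be close to 1
print("weights:", w, "means:", mu, "sds:", sigma, "area:", area)
```

Once fitted, the mixture density summarises the general shape of the data and can be used to assess how plausible a future value is. Other component families or estimation methods may suit real data better.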


Source journal

Annals of Data Science (Decision Sciences: Statistics, Probability and Uncertainty)

CiteScore: 6.50

Self-citation rate: 0.00%


Journal description: Annals of Data Science (ADS) publishes cutting-edge research findings, experimental results and case studies in data science. Although Data Science is regarded as an interdisciplinary field that uses mathematics, statistics, databases, data mining, high-performance computing, knowledge management and virtualization to discover knowledge from Big Data, it should have its own scientific content, such as axioms, laws and rules, which are fundamentally important for experts in different fields to explore their own interests in Big Data. ADS encourages contributors to address such challenging problems on this exchange platform. At present, the question of how to discover knowledge from heterogeneous data in a Big Data environment needs to be addressed. ADS is a series of volumes edited by either the editorial office or guest editors. Guest editors are responsible for the call for papers and the review process for high-quality contributions in their volumes.