Data quality model for assessing public COVID-19 big datasets.

IF 2.5 3区 计算机科学 Q2 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE
Alladoumbaye Ngueilbaye, Joshua Zhexue Huang, Mehak Khan, Hongzhi Wang
{"title":"Data quality model for assessing public COVID-19 big datasets.","authors":"Alladoumbaye Ngueilbaye, Joshua Zhexue Huang, Mehak Khan, Hongzhi Wang","doi":"10.1007/s11227-023-05410-0","DOIUrl":null,"url":null,"abstract":"<p><p>For decision-making support and evidence based on healthcare, high quality data are crucial, particularly if the emphasized knowledge is lacking. For public health practitioners and researchers, the reporting of COVID-19 data need to be accurate and easily available. Each nation has a system in place for reporting COVID-19 data, albeit these systems' efficacy has not been thoroughly evaluated. However, the current COVID-19 pandemic has shown widespread flaws in data quality. We propose a data quality model (canonical data model, four adequacy levels, and Benford's law) to assess the quality issue of COVID-19 data reporting carried out by the World Health Organization (WHO) in the six Central African Economic and Monitory Community (CEMAC) region countries between March 6,2020, and June 22, 2022, and suggest potential solutions. These levels of data quality sufficiency can be interpreted as dependability indicators and sufficiency of Big Dataset inspection. This model effectively identified the quality of the entry data for big dataset analytics. The future development of this model requires scholars and institutions from all sectors to deepen their understanding of its core concepts, improve integration with other data processing technologies, and broaden the scope of its applications.</p>","PeriodicalId":50034,"journal":{"name":"Journal of Supercomputing","volume":" ","pages":"1-33"},"PeriodicalIF":2.5000,"publicationDate":"2023-05-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10230148/pdf/","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Journal of Supercomputing","FirstCategoryId":"94","ListUrlMain":"https://doi.org/10.1007/s11227-023-05410-0","RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"COMPUTER SCIENCE, HARDWARE & ARCHITECTURE","Score":null,"Total":0}
引用次数: 0

Abstract

For decision-making support and evidence based on healthcare, high quality data are crucial, particularly if the emphasized knowledge is lacking. For public health practitioners and researchers, the reporting of COVID-19 data need to be accurate and easily available. Each nation has a system in place for reporting COVID-19 data, albeit these systems' efficacy has not been thoroughly evaluated. However, the current COVID-19 pandemic has shown widespread flaws in data quality. We propose a data quality model (canonical data model, four adequacy levels, and Benford's law) to assess the quality issue of COVID-19 data reporting carried out by the World Health Organization (WHO) in the six Central African Economic and Monitory Community (CEMAC) region countries between March 6,2020, and June 22, 2022, and suggest potential solutions. These levels of data quality sufficiency can be interpreted as dependability indicators and sufficiency of Big Dataset inspection. This model effectively identified the quality of the entry data for big dataset analytics. The future development of this model requires scholars and institutions from all sectors to deepen their understanding of its core concepts, improve integration with other data processing technologies, and broaden the scope of its applications.

Abstract Image

Abstract Image

Abstract Image

用于评估公共新冠肺炎大数据集的数据质量模型。
对于基于医疗保健的决策支持和证据,高质量的数据至关重要,尤其是在缺乏所强调的知识的情况下。对于公共卫生从业者和研究人员来说,新冠肺炎数据的报告需要准确且易于获得。每个国家都有一个报告新冠肺炎数据的系统,尽管这些系统的功效尚未得到彻底评估。然而,当前的新冠肺炎疫情显示出数据质量方面的普遍缺陷。我们提出了一个数据质量模型(规范数据模型、四个充分性水平和本福德定律),以评估世界卫生组织(世界卫生组织)在2020年3月6日至2022年6月22日期间在中非经济和监测共同体(中非经货共同体)六个区域国家进行的新冠肺炎数据报告的质量问题,并提出潜在的解决方案。这些数据质量充分性水平可以解释为大数据集检查的可靠性指标和充分性。该模型有效地确定了大数据集分析的入口数据的质量。该模型的未来发展需要各界学者和机构加深对其核心概念的理解,提高与其他数据处理技术的集成度,拓宽其应用范围。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 求助全文
来源期刊
Journal of Supercomputing
Journal of Supercomputing 工程技术-工程:电子与电气
CiteScore
6.30
自引率
12.10%
发文量
734
审稿时长
13 months
期刊介绍: The Journal of Supercomputing publishes papers on the technology, architecture and systems, algorithms, languages and programs, performance measures and methods, and applications of all aspects of Supercomputing. Tutorial and survey papers are intended for workers and students in the fields associated with and employing advanced computer systems. The journal also publishes letters to the editor, especially in areas relating to policy, succinct statements of paradoxes, intuitively puzzling results, partial results and real needs. Published theoretical and practical papers are advanced, in-depth treatments describing new developments and new ideas. Each includes an introduction summarizing prior, directly pertinent work that is useful for the reader to understand, in order to appreciate the advances being described.
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信