A Comprehensive COVID-19 Database for the United States

2021 Systems and Information Engineering Design Symposium (SIEDS) | Pub Date: 2021-04-30 | DOI: 10.1109/SIEDS52267.2021.9483754

Gunnar Sundberg, Bayazit Karaman


Citations: 0

Abstract


A Comprehensive COVID-19 Database for the United States

The diversity of responses to, and conditions resulting from, the COVID-19 pandemic in the United States has provided rich data for researchers to study, especially as the pandemic continues to progress. With more than a full year of data available for different regions and at different granularities, methods of analysis that require larger datasets are now worth examining or refining. Furthermore, as the United States seeks to move away from national and state-wide policies toward approaches focused on individual communities, open data must be provided at both the state and county levels. In this paper, a comprehensive database encompassing COVID-19 data and a large body of related data is proposed. The database includes data on cases and deaths, testing, mobility, demographics, weather, and more at both the US state and county levels. The system was implemented using the Python framework Django and the high-performance RDBMS PostgreSQL. A data-processing pipeline was implemented using the asynchronous task library Celery to gather and clean data from various verified sources. This database has been used to build a web application for concise reporting and an open API for public access to the data. A reference web application using the API is currently available at www.bigdatacovid.com, and the API is available at www.bigdatacovid.com/api/v1, with API documentation available on the website.
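
The abstract does not describe the cleaning steps in detail. As a minimal sketch of one transformation such a pipeline might apply, the Python below converts cumulative county-level case counts into daily new cases. The record layout (county FIPS code, ISO date, cumulative count), the function name `daily_new_cases`, and the choice to clamp negative differences to zero are assumptions made for illustration, not the authors' implementation.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

# Hypothetical record layout: (county FIPS code, ISO date, cumulative cases).
Record = Tuple[str, str, int]


def daily_new_cases(records: Iterable[Record]) -> Dict[str, List[Tuple[str, int]]]:
    """Convert cumulative counts into daily new cases for each county."""
    # Group the raw rows by county so each series is processed independently.
    by_county: Dict[str, List[Tuple[str, int]]] = defaultdict(list)
    for fips, date, cumulative in records:
        by_county[fips].append((date, cumulative))

    result: Dict[str, List[Tuple[str, int]]] = {}
    for fips, series in by_county.items():
        series.sort()  # ISO dates sort chronologically as plain strings
        previous = 0   # the first observed total is treated as that day's count
        daily: List[Tuple[str, int]] = []
        for date, cumulative in series:
            # Sources sometimes revise totals downward; clamping the
            # difference to zero is an assumption of this sketch.
            daily.append((date, max(cumulative - previous, 0)))
            previous = cumulative
        result[fips] = daily
    return result


# Made-up sample values, deliberately out of order and with one downward revision.
sample = [
    ("51540", "2021-01-02", 15),
    ("51540", "2021-01-01", 10),
    ("51540", "2021-01-03", 14),
    ("17197", "2021-01-01", 3),
]
print(daily_new_cases(sample))
```

In a Celery-based setup like the one the abstract mentions, a function of this kind could be wrapped as a task and scheduled after each data fetch; how the authors actually structure their tasks is not stated here.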

