Developing a data pipeline to improve accessibility and utilization of Charlottesville's Open Data Portal
L. Beane, Elena Gillis, Raf Alvarado, C. Wylie
2019 Systems and Information Engineering Design Symposium (SIEDS), published 2019-04-26
DOI: https://doi.org/10.1109/SIEDS.2019.8735653
To improve democratic engagement between the people and the government, the city of Charlottesville proposed an online portal containing data from city departments that is public by nature. The portal was intended to ease access to data pertinent to ongoing policy debates in the city and to encourage informed public participation in the policy-making process. These efforts, while successful at the start, have gradually stagnated, and the portal's end objective has not been reached. In this paper we identify possible reasons for this stagnation: inconsistent formatting of the datasets, variables that are not designed to be human-readable, and limited data with disproportionate representation of the city departments. We then propose a data pipeline that serves as a tool to extract utility from the data. It does so by converting the datasets into a consistent format, merging them, and supporting the creation of simple visualizations. The pipeline acts as a link between the raw data published by government units and the city by making the data more interpretable and legible and by producing outputs that relate directly to the policy issues at hand. We demonstrate this by analyzing crime and real estate datasets and relating our findings to the affordable housing debate.
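The three pipeline stages named in the abstract (conversion to a consistent format, merging, simple visualization) can be illustrated with a minimal sketch. This is not the authors' implementation: the column names, date formats, neighborhood labels, values, and helper functions (`normalize_header`, `parse_date`, `load`, `merge`, `bar_chart`) are all hypothetical and chosen only to show the shape of such a pipeline using the Python standard library.

```python
import csv
import io
from collections import Counter
from datetime import datetime

# Hypothetical raw extracts: every column name, date format and value below
# is invented for illustration and is not taken from the portal.
CRIME_CSV = """Offense,DateReported,NEIGHBORHOOD
Larceny,2018-03-01,Northside
Vandalism,03/15/2018,northside
Larceny,2018-04-02,Southside
"""

REAL_ESTATE_CSV = """neighborhood ,Assessed Value
Northside,250000
Southside,180000
"""

# Assumed set of date formats found across the raw files.
DATE_FORMATS = ("%Y-%m-%d", "%m/%d/%Y")


def normalize_header(name):
    """Trim a column name, lower-case it, and replace spaces with underscores."""
    return name.strip().lower().replace(" ", "_")


def parse_date(text):
    """Try each known format and return an ISO date string, or None."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(text.strip(), fmt).date().isoformat()
        except ValueError:
            continue
    return None


def load(raw):
    """Stage 1: read CSV text into dicts with consistent headers and values."""
    rows = []
    for record in csv.DictReader(io.StringIO(raw)):
        row = {normalize_header(k): v.strip() for k, v in record.items()}
        if "neighborhood" in row:
            row["neighborhood"] = row["neighborhood"].title()
        if "datereported" in row:
            row["datereported"] = parse_date(row["datereported"])
        rows.append(row)
    return rows


def merge(crimes, parcels):
    """Stage 2: join per-neighborhood crime counts onto the real estate rows."""
    counts = Counter(row["neighborhood"] for row in crimes)
    return [
        {
            "neighborhood": parcel["neighborhood"],
            "assessed_value": int(parcel["assessed_value"]),
            "crime_count": counts.get(parcel["neighborhood"], 0),
        }
        for parcel in parcels
    ]


def bar_chart(rows, key):
    """Stage 3: render one text bar per row as a simple visualization."""
    return "\n".join(
        f"{r['neighborhood']:<10} {'#' * r[key]} {r[key]}" for r in rows
    )


merged = merge(load(CRIME_CSV), load(REAL_ESTATE_CSV))
print(bar_chart(merged, "crime_count"))
```

Keeping each stage as a separate function means a new dataset only needs its own header and date conventions handled in the loading step before it can be merged and charted alongside the others.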