CandiDATA: an enhanced dataset for data analysis of elections in Brazil from 1945 to 2020
Felipe F. Vasconcelos, João V. S. Tavares, Matheus G. S. Oliveira, Fábio Coutinho, João Paulo Clarindo
J. Inf. Data Manag., published 2022-08-15. DOI: 10.5753/jidm.2022.2361 (https://doi.org/10.5753/jidm.2022.2361)
Abstract
The Brazilian Superior Electoral Court (TSE) keeps data on elections that have taken place in Brazil since 1933. These data constitute an important collection that serves as a reference for work in several research areas. However, the collection is not fully exploited because of problems such as missing and non-standard data, which make analysis and integration with external databases difficult. Because of these problems, previous works built limited datasets and tools: they only include data from the 1998 election onward, disregarding the election years from 1945 to 1996. This work discusses the steps taken to create CandiDATA, a standardized and enhanced dataset built from TSE data, together with a toolkit for web scraping and data visualization. CandiDATA is available in an open format and covers the election period between 1945 and 2020.