Ballot Tabulation Using Deep Learning

2023 IEEE 24th International Conference on Information Reuse and Integration for Data Science (IRI). Pub Date: 2023-08-01. DOI: 10.1109/IRI58017.2023.00026

Fei Zhao, Chengcui Zhang, Nitesh Saxena, D. Wallach, AKM SHAHARIAR AZAD RABBY



Abstract

Currently deployed election systems that scan and process hand-marked ballots are not sophisticated enough to handle insufficiently filled marks (e.g., partially filled bubbles), improper marks (e.g., check marks or crosses instead of filled bubbles), or marks outside of bubbles; they rely on a threshold to decide whether the pixels inside a bubble are dark and dense enough to count as a vote. Existing work along this line is still largely limited in its degree of automation and requires substantial manpower for annotation and adjudication. In this study, we propose a highly automated ballot tabulation assistant, based on a deep learning (DL) mark segmentation model, that accurately identifies legitimate ballot marks. For comparison, we also developed a highly customized method based on traditional computer vision (T-CV) mark segmentation, and we include a detailed discussion of the two approaches. In experiments on two real election datasets, the highest ballot tabulation accuracy achieved was 99.984%. To further enhance our DL model's ability to detect marks that are underrepresented in training datasets, e.g., insufficiently or improperly filled marks, we propose a Siamese network architecture that enables the model to exploit the contrasting features between a hand-marked ballot image and its corresponding blank template image. Without any extra data collection, incorporating this architecture not only gave our DL-based tabulation method a higher accuracy score but also substantially reduced its overall false negative rate.
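The threshold rule that the abstract attributes to deployed systems can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the function names, the 0 to 255 grayscale convention, and the cutoff values are hypothetical and do not come from the paper. The second function, `changed_fraction`, is only a crude pixel-difference stand-in for the idea of comparing a marked ballot against its blank template; it is not the Siamese network or the segmentation model the authors propose.

```python
def dark_fraction(patch, dark_level=128):
    """Fraction of pixels darker than dark_level (0 = black, 255 = white)."""
    total = sum(len(row) for row in patch)
    if total == 0:
        raise ValueError("empty patch")
    dark = sum(1 for row in patch for px in row if px < dark_level)
    return dark / total


def is_vote_by_threshold(patch, dark_level=128, min_fraction=0.5):
    """Hypothetical baseline: count a bubble as voted if enough pixels are dark."""
    return dark_fraction(patch, dark_level) >= min_fraction


def changed_fraction(marked, template, min_delta=64):
    """Fraction of pixels at least min_delta darker in marked than in template.

    Illustrative stand-in for template comparison; not the paper's method.
    """
    total = 0
    changed = 0
    for marked_row, template_row in zip(marked, template):
        for m, t in zip(marked_row, template_row):
            total += 1
            if t - m >= min_delta:
                changed += 1
    if total == 0:
        raise ValueError("empty patch")
    return changed / total


# Toy 4x4 bubble patches (assumed data, not from the election datasets).
blank = [[255] * 4 for _ in range(4)]
filled = [[0] * 4 for _ in range(4)]
# A thin diagonal stroke, standing in for part of a check mark or cross.
stroke = [[0 if r == c else 255 for c in range(4)] for r in range(4)]

print(is_vote_by_threshold(filled))   # fully filled bubble passes the cutoff
print(is_vote_by_threshold(stroke))   # thin stroke falls below the cutoff
print(changed_fraction(stroke, blank))
```

With these toy patches, the thin stroke covers only a quarter of the bubble, so a darkness cutoff of one half rejects it even though a voter clearly made a mark. That is the kind of false negative the abstract describes, and it is why a fixed threshold alone cannot separate improper marks from blank bubbles.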

