Document Layout Error Rate (DLER) metric to evaluate image segmentation methods
Ari Vesalainen, Mikko Tolonen, Laura Ruotsalainen
Machine Learning with Applications, volume 18, Article 100606. Published 2024-11-19. DOI: 10.1016/j.mlwa.2024.100606
https://www.sciencedirect.com/science/article/pii/S2666827024000823
Abstract
Scholarly editions play a crucial role in humanities research, particularly in the study of literature and historical documents. The primary objective of these editions is to reconstruct the original text or provide insights into the author’s intentions. Traditionally, crafting a critical edition required a lifetime of dedication. However, thanks to recent advancements in deep learning and computer vision, modern text recognition tools can now be used to expedite this process. A key part of these tools is document layout analysis (DLA), where image segmentation methods are used to detect different text elements. Most existing DLA solutions have focused on evaluating the accuracy of these methods, often neglecting to study the practical consequences of method selection. In this study, we have developed a new metric, the Document Layout Error Rate (DLER), which evaluates the performance of fine-grained DLA methods within the overall pipeline. This metric helps identify the method with the lowest error rate, thereby minimizing the manual effort required for corrections. We applied this evaluation method to assess four different methods and their efficacy for the DLA task in the context of David Hume’s History of England.