Extracting Handwritten Annotations from Printed Documents Via Infrared Scanning

Andreas Schmid, Lorenz Heckelbacher, Raphael Wimmer
Venue: CHI Conference on Human Factors in Computing Systems Extended Abstracts
Published: 2022-04-27
DOI: https://doi.org/10.1145/3491101.3519872
Citations: 1

Abstract

Despite ever-improving digital ink and paper solutions, many people still prefer printing out documents for close reading, proofreading, or filling out forms. However, in order to incorporate paper-based annotations into digital workflows, handwritten text and markings need to be extracted. Common computer-vision and machine-learning approaches require extensive sets of training data or a clean digital version of the document. We propose a simple method for extracting handwritten annotations from laser-printed documents using multispectral imaging. While black toner absorbs infrared light, most inks are invisible in the infrared spectrum. We modified an off-the-shelf flatbed scanner by adding a switchable infrared LED to its light guide. By subtracting an infrared scan from a color scan, handwritten text and highlighting can be extracted and added to a PDF version. Initial experiments show accurate results with high quality on a test data set of 93 annotated pages. Thus, infrared scanning seems like a promising building block for integrating paper-based and digital annotation practices.
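The subtraction step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `extract_annotations`, the `threshold` value, the "darkest channel" measure of visible darkness, and the assumption that both scans are already pixel-aligned are all hypothetical choices made for illustration. The idea is only that toner is dark in both scans, while ink and highlighter are dark or colored in the color scan but close to paper-white in the infrared scan.

```python
import numpy as np


def extract_annotations(color_scan, ir_scan, threshold=60):
    """Return a boolean mask of pixels that look like handwriting or highlighting.

    color_scan: uint8 array of shape (H, W, 3), RGB scan under visible light.
    ir_scan:    uint8 array of shape (H, W), scan under infrared light.
    Both scans are assumed (hypothetically) to be pixel-aligned.
    """
    # Use a signed type so the subtraction below cannot wrap around.
    color = color_scan.astype(np.int16)
    ir = ir_scan.astype(np.int16)

    # Visible darkness: distance of the darkest color channel from white paper.
    # Taking the darkest channel also catches bright but saturated highlighter.
    visible_darkness = 255 - color.min(axis=2)

    # Infrared darkness: toner absorbs infrared light, so only print stays dark.
    ir_darkness = 255 - ir

    # Subtracting the infrared scan removes the print and keeps the annotations.
    difference = visible_darkness - ir_darkness
    return difference > threshold


# Synthetic one-row example: paper, black toner, blue ink, yellow highlighter.
color_scan = np.array(
    [[[255, 255, 255], [20, 20, 20], [30, 40, 160], [255, 240, 60]]],
    dtype=np.uint8,
)
ir_scan = np.array([[255, 30, 250, 255]], dtype=np.uint8)

mask = extract_annotations(color_scan, ir_scan)
print(mask)
```

In this toy input only the ink and highlighter pixels exceed the threshold; real scans would additionally need registration between the two passes and a threshold tuned to the scanner, neither of which is covered here.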