Andreas Schmid, Lorenz Heckelbacher, Raphael Wimmer
{"title":"通过红外扫描从打印文档中提取手写注释","authors":"Andreas Schmid, Lorenz Heckelbacher, Raphael Wimmer","doi":"10.1145/3491101.3519872","DOIUrl":null,"url":null,"abstract":"Despite ever improving digital ink and paper solutions, many people still prefer printing out documents for close reading, proofreading, or filling out forms. However, in order to incorporate paper-based annotations into digital workflows, handwritten text and markings need to be extracted. Common computer-vision and machine-learning approaches require extensive sets of training data or a clean digital version of the document. We propose a simple method for extracting handwritten annotations from laser-printed documents using multispectral imaging. While black toner absorbs infrared light, most inks are invisible in the infrared spectrum. We modified an off-the-shelf flatbed scanner by adding a switchable infrared LED to its light guide. By subtracting an infrared scan from a color scan, handwritten text and highlighting can be extracted and added to a PDF version. Initial experiments show accurate results with high quality on a test data set of 93 annotated pages. Thus, infrared scanning seems like a promising building block for integrating paper-based and digital annotation practices.","PeriodicalId":123301,"journal":{"name":"CHI Conference on Human Factors in Computing Systems Extended Abstracts","volume":"475 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2022-04-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"1","resultStr":"{\"title\":\"Extracting Handwritten Annotations from Printed Documents Via Infrared Scanning\",\"authors\":\"Andreas Schmid, Lorenz Heckelbacher, Raphael Wimmer\",\"doi\":\"10.1145/3491101.3519872\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Despite ever improving digital ink and paper solutions, many people still prefer printing out documents for close reading, proofreading, or filling out forms. However, in order to incorporate paper-based annotations into digital workflows, handwritten text and markings need to be extracted. Common computer-vision and machine-learning approaches require extensive sets of training data or a clean digital version of the document. We propose a simple method for extracting handwritten annotations from laser-printed documents using multispectral imaging. While black toner absorbs infrared light, most inks are invisible in the infrared spectrum. We modified an off-the-shelf flatbed scanner by adding a switchable infrared LED to its light guide. By subtracting an infrared scan from a color scan, handwritten text and highlighting can be extracted and added to a PDF version. Initial experiments show accurate results with high quality on a test data set of 93 annotated pages. Thus, infrared scanning seems like a promising building block for integrating paper-based and digital annotation practices.\",\"PeriodicalId\":123301,\"journal\":{\"name\":\"CHI Conference on Human Factors in Computing Systems Extended Abstracts\",\"volume\":\"475 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2022-04-27\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"1\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"CHI Conference on Human Factors in Computing Systems Extended Abstracts\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1145/3491101.3519872\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"CHI Conference on Human Factors in Computing Systems Extended Abstracts","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3491101.3519872","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Extracting Handwritten Annotations from Printed Documents Via Infrared Scanning
Despite ever improving digital ink and paper solutions, many people still prefer printing out documents for close reading, proofreading, or filling out forms. However, in order to incorporate paper-based annotations into digital workflows, handwritten text and markings need to be extracted. Common computer-vision and machine-learning approaches require extensive sets of training data or a clean digital version of the document. We propose a simple method for extracting handwritten annotations from laser-printed documents using multispectral imaging. While black toner absorbs infrared light, most inks are invisible in the infrared spectrum. We modified an off-the-shelf flatbed scanner by adding a switchable infrared LED to its light guide. By subtracting an infrared scan from a color scan, handwritten text and highlighting can be extracted and added to a PDF version. Initial experiments show accurate results with high quality on a test data set of 93 annotated pages. Thus, infrared scanning seems like a promising building block for integrating paper-based and digital annotation practices.