Proceedings of the 21st ACM Symposium on Document Engineering: Latest Articles

Towards extraction of theorems and proofs in scholarly articles
Proceedings of the 21st ACM Symposium on Document Engineering, Pub Date: 2021-08-16, DOI: 10.1145/3469096.3475059
Shrey Mishra, Lucas Pluvinage, P. Senellart
Abstract: Scholarly articles in mathematical fields often feature mathematical statements (theorems, propositions, etc.) and their proofs. In this paper, we present preliminary work for extracting such information from PDF documents, with several types of approaches: vision (using YOLO), natural language (with transformers), and styling information (with linear conditional random fields). Our main task is to identify which parts of the paper to label as theorem-like environments and proofs. We rely on a dataset collected from arXiv, with LaTeX sources of research articles used to train the models.
Citations: 7
Counterfeit detection with QR codes
Proceedings of the 21st ACM Symposium on Document Engineering, Pub Date: 2021-08-16, DOI: 10.1145/3469096.3474924
J. Picard, Paul Landry, Michael Bolay
Abstract: Serialized QR Codes applied to product packaging have received considerable interest as a potential solution to the problem of industrial counterfeiting. Compared to traditional security solutions (e.g. taggants, holograms, security inks), they are indeed simpler to integrate in existing production workflow, easier to verify, and more cost-effective at scale. In addition, they allow to convey a product digital identity and history, and are used to connect brand owners with consumers. However, by itself a QR Code offers no protection at all against copy by cloning. Various schemes have been proposed to add a copy-sensitive layer to QR Codes, but most techniques in the state of the art have not been applied to real production environments, where QR Codes are printed at mass scale and products are scanned in a non-controlled environment, typically by consumers with their smartphones. This paper presents a system based on integrating copy detection patterns into QR Codes, which has been deployed for a number of years on the market.
Citations: 11
Session details: Tutorials
Proceedings of the 21st ACM Symposium on Document Engineering, Pub Date: 2021-08-16, DOI: 10.1145/3482779
R. Lins
Citations: 0
Session details: Document content analysis
Proceedings of the 21st ACM Symposium on Document Engineering, Pub Date: 2021-08-16, DOI: 10.1145/3482781
Besat Kassaie
Citations: 0
Session details: Keynote I
Proceedings of the 21st ACM Symposium on Document Engineering, Pub Date: 2021-08-16, DOI: 10.1145/3347320.3368938
P. Healy
Citations: 0
Direct binarization a quality-and-time efficient binarization strategy
Proceedings of the 21st ACM Symposium on Document Engineering, Pub Date: 2021-08-16, DOI: 10.1145/3469096.3474932
R. Lins, R. Bernardino, Ricardo da Silva Barboza, Zanoni Dueire Lins
Abstract: Most of the best known binarization algorithms have grayscale conversion as a pre-processing step, before applying the binarization strategy itself. Many algorithms produce equally good or even better quality images if fed with only one component of the image, instead of its gray-scale/luminance equivalent. The time-gain here is obtained in avoiding the several floating-point calculations in converting an RGB-color image into grayscale. More than 60 binarization algorithms were tested using "real-world" images.
Citations: 5
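The idea described in the abstract above, feeding a binarization algorithm one color component instead of the grayscale/luminance image, can be sketched roughly as follows. This is an illustration only and not the authors' implementation: Otsu thresholding is used as a stand-in for the many algorithms the paper tested, the choice of the green channel is an assumption, the BT.601 luminance weights are one common convention, and the names `otsu_threshold`, `binarize`, `to_luminance`, and `direct_channel` are hypothetical.

```python
from typing import List, Tuple

Pixel = Tuple[int, int, int]  # (R, G, B), each component in 0..255


def otsu_threshold(values: List[int]) -> int:
    """Return the threshold in 0..255 that maximizes between-class variance."""
    hist = [0] * 256
    for v in values:
        hist[v] += 1
    total = len(values)
    sum_all = sum(i * h for i, h in enumerate(hist))
    best_t, best_var = 0, -1.0
    w_bg, sum_bg = 0, 0
    for t in range(256):
        w_bg += hist[t]
        sum_bg += t * hist[t]
        if w_bg == 0:
            continue  # no pixels at or below t yet
        w_fg = total - w_bg
        if w_fg == 0:
            break  # every pixel is already in the dark class
        mean_bg = sum_bg / w_bg
        mean_fg = (sum_all - sum_bg) / w_fg
        var_between = w_bg * w_fg * (mean_bg - mean_fg) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t


def binarize(values: List[int]) -> List[int]:
    """Map each value to 0 (dark, e.g. ink) or 1 (light, e.g. paper)."""
    t = otsu_threshold(values)
    return [1 if v > t else 0 for v in values]


def to_luminance(pixels: List[Pixel]) -> List[int]:
    """Conventional path: floating-point weighted sum per pixel (BT.601 weights)."""
    return [int(round(0.299 * r + 0.587 * g + 0.114 * b)) for r, g, b in pixels]


def direct_channel(pixels: List[Pixel], channel: int = 1) -> List[int]:
    """Direct path: pick one component (green by default, an assumption here)."""
    return [p[channel] for p in pixels]


# Tiny synthetic "document": four ink pixels followed by six paper pixels.
pixels = [(20, 25, 30)] * 4 + [(230, 225, 210)] * 6
mask_gray = binarize(to_luminance(pixels))
mask_direct = binarize(direct_channel(pixels))
```

On this toy input both paths separate ink from paper identically, while the direct path skips the per-pixel floating-point arithmetic. Whether that holds on real photographs or scans depends on the image and the algorithm, which is the question the paper evaluates.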
Binarisation of photographed documents image quality and processing time assessment
Proceedings of the 21st ACM Symposium on Document Engineering, Pub Date: 2021-08-16, DOI: 10.1145/3469096.3470833
R. Lins, S. Simske, R. Bernardino
Abstract: Smartphones with cameras are omnipresent in today's world and are very often used to photograph documents. Document binarization is a key process in many document processing platforms. This competition on binarizing photographed documents assessed the quality and time performance of 13 new algorithms and 50 existing algorithms. The evaluation dataset is composed of offset, laser, and deskjet printed documents, photographed using four widely-used mobile devices with the strobe flash on and off, under two different angles and places of capture.
Citations: 2
Engineering of an artificial intelligence safety data sheet document processing system for environmental, health, and safety compliance
Proceedings of the 21st ACM Symposium on Document Engineering, Pub Date: 2021-08-16, DOI: 10.1145/3469096.3474933
Kevin Fenton, S. Simske
Abstract: Chemical Safety Data Sheets (SDS) are the primary method by which chemical manufacturers communicate the ingredients and hazards of their products to the public. These SDSs are used for a wide variety of purposes ranging from environmental calculations to occupational health assessments to emergency response measures. Although a few companies have provided direct digital data transfer platforms using XML or equivalent schemata, the vast majority of chemical ingredient and hazard communication to product users still occurs through the use of millions of PDF documents that are largely loaded through manual data entry into downstream user databases. This research focuses on the reverse engineering of SDS document types to adapt to various layouts and the harnessing of meta-algorithmic and neural network approaches to provide a means of moving industrial institutions towards a digital universal SDS processing methodology. The complexities of SDS documents including the lack of format standardization, text and image combinations, and multi-lingual translation needs, combined, limit the accuracy and precision of optical character recognition tools. The approach in this document is to translate entire SDSs from thousands of chemical vendors, each with distinct formatting, to machine-encoded text with a high degree of accuracy and precision. Then the system will "read" and assess these documents as a human would; that is, ensuring that the documents are compliant, determining whether chemical formulations have changed, ensuring reported values are within expected thresholds, and comparing them to similar products for more environmentally friendly alternatives.
Citations: 1
Metadata-driven eye tracking for real-time applications
Proceedings of the 21st ACM Symposium on Document Engineering, Pub Date: 2021-08-16, DOI: 10.1145/3469096.3474935
Yasith Jayawardana, Gavindya Jayawardena, A. Duchowski, S. Jayarathna
Abstract: When conducting eye tracking studies, having a mechanism to collect data, build workflows, and validate results in a FAIR (i.e., findable, accessible, interoperable, and reusable) manner, facilitates automation. Given the vast landscape of vendor-specific eye tracking software, adopting FAIR metadata standards for the eye tracking domain is one step towards this. In this paper, we propose an approach to simplify the creation, execution, and validation of eye tracking studies through metadata. Using a metadata format that we developed, we first describe two eye trackers, and two datasets collected using them. Next, we use this metadata to simulate real-time data collection by replaying each dataset. From this replayed data, we analyze eye movements in real-time, and synthesize eye movement data from analytics in real-time. Based on our results, we discuss the utility of metadata in real-time eye tracking studies, and how this idea can be generalized into other applications.
Citations: 5
Challenges in chart image classification: a comparative study of different deep learning methods
Proceedings of the 21st ACM Symposium on Document Engineering, Pub Date: 2021-08-16, DOI: 10.1145/3469096.3474931
Jennil Thiyam, Sanasam Ranbir Singh, P. Bora
Abstract: Charts are commonly used forms of visualizing scientific observations from research findings or commercial trends. They provide an abstraction of the underlying information in a more understandable way. Over time, different forms of charts are developed. With the increase in the number of scientific documents present on the internet with different types of charts, automatic chart classification is becoming an important task for various applications. There have been several studies on chart classification with methods ranging from traditional machine learning approaches like SVM, KNN, and HMM to recent deep learning models like VGG, ResNet, and Xception. However, inconsistencies in experimental results are evident. This paper evaluates nine of the recently proposed deep learning-based models on three datasets (one curated and annotated by authors, and two publicly available), and systematically studies their performances over various setups to understand the reason for observing inconsistent results.
Citations: 10