Story Beyond the Eye: Glyph Positions Break PDF Text Redaction

Maxwell Bland, Anushya Iyer, Kirill Levchenko
Published in: Proceedings on Privacy Enhancing Technologies. Privacy Enhancing Technologies Symposium
Publication date: 2023-07-01
DOI: 10.56553/popets-2023-0069 (https://doi.org/10.56553/popets-2023-0069)
Citation count: 2

Abstract

In this work we find that many current redactions of PDF text are insecure due to non-redacted character positioning information. In particular, subpixel-sized horizontal shifts in redacted and non-redacted characters can be recovered and used to effectively deredact first and last names. Unfortunately these findings affect redactions where the text underneath the black box is removed from the PDF. We demonstrate these findings by performing a comprehensive vulnerability assessment of common PDF redaction types. We examine 11 popular PDF redaction tools, including Adobe Acrobat, and find that they leak information about redacted text. We also effectively deredact hundreds of real-world PDF redactions, including those found in OIG investigation reports and FOIA responses. To correct the problem, we have released open source algorithms to fix vulnerable redactions and reduce the amount of information leaked by nonexcising redactions (where the text underneath the redaction is copy-pastable). We have also notified the developers of the studied redaction tools. We have notified the Office of Inspector General, the Free Law Project, PACER, Adobe, Microsoft, and the US Department of Justice. We are working with several of these groups to prevent our discoveries from being used for malicious purposes.
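
To make the leak described above concrete, here is a minimal sketch of the simplest version of the idea: if the width of a redacted span can be measured from the surrounding glyph positions, a list of candidate names can be filtered down to those whose rendered width matches. This is not the authors' algorithm, which exploits finer-grained subpixel shifts. The advance widths, the candidate list, the tolerance, and the function names `text_width` and `candidates_matching_width` are all hypothetical and chosen only for illustration.

```python
from typing import Dict, List

# Hypothetical per-glyph advance widths in 1/1000 em units.
# These values are illustrative assumptions, not read from a real font file.
ADVANCE: Dict[str, int] = {
    "A": 722, "B": 667, "C": 667, "D": 722,
    "a": 444, "b": 500, "c": 444, "e": 444, "i": 278,
    "l": 278, "o": 500, "r": 333, "v": 500,
}


def text_width(text: str, advance: Dict[str, int], font_size: float) -> float:
    """Return the rendered width of `text` in points, ignoring kerning."""
    return sum(advance[ch] for ch in text) * font_size / 1000.0


def candidates_matching_width(
    names: List[str],
    gap_width: float,
    advance: Dict[str, int],
    font_size: float,
    tolerance: float = 0.05,
) -> List[str]:
    """Keep only the names whose rendered width is within `tolerance` of the gap."""
    return [
        name
        for name in names
        if all(ch in advance for ch in name)
        and abs(text_width(name, advance, font_size) - gap_width) <= tolerance
    ]


if __name__ == "__main__":
    # Assumed measurement: the redacted span is 25.992 pt wide at 12 pt type.
    remaining = candidates_matching_width(
        ["Alice", "Bob", "Carol", "Dave"], 25.992, ADVANCE, 12.0
    )
    print(remaining)
```

Under these assumed widths, only one of the four candidate names fits the measured gap, which illustrates why removing the text under the black box is not sufficient when the positions of the surrounding glyphs are left intact.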