Establishing Genealogies of Born Digital Content: The Suitability of Revision Identifier (RSID) Numbers in MS Word for Forensic Enquiry

Impact Factor: 4.6. JCR quartile: Q1 (Information Science & Library Science)
D. Spennemann, Rudolf J. Spennemann
{"title":"Establishing Genealogies of Born Digital Content: The Suitability of Revision Identifier (RSID) Numbers in MS Word for Forensic Enquiry","authors":"D. Spennemann, Rudolf J. Spennemann","doi":"10.3390/publications11030035","DOIUrl":null,"url":null,"abstract":"Born-digital content is rapidly becoming the norm for literary works, professional reports, academic journal articles, and formal corporate correspondence. From the perspective of digital forensics, there is a need to understand the origin of a document and its entire creation process, from outlining and drafting to editing the final version of the text. Revision save identifier (RSID) numbers embedded in MS Word documents have been used to examine the nature and extent of individual edits within a document. These RSIDs remain logged in the metadata even if the text with which they were associated has been removed. As copies of such files retain the original’s RSIDs, this metadata can be used to determine the order in which documents were cloned from each other. As a proof-of-concept, this paper examined over 400 template files generated by a single publisher for manuscript submissions to its journals. 
The study can show that it is possible to establish genealogies and thus relative chronologies of born digital content by first identifying those documents that share a document (root) RSID and then seriating those RSIDs that are shared between two or more documents.","PeriodicalId":37551,"journal":{"name":"Publications","volume":" ","pages":""},"PeriodicalIF":4.6000,"publicationDate":"2023-06-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"1","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Publications","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.3390/publications11030035","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"INFORMATION SCIENCE & LIBRARY SCIENCE","Score":null,"Total":0}
Citations: 1

Abstract

Born-digital content is rapidly becoming the norm for literary works, professional reports, academic journal articles, and formal corporate correspondence. From the perspective of digital forensics, there is a need to understand the origin of a document and its entire creation process, from outlining and drafting to editing the final version of the text. Revision save identifier (RSID) numbers embedded in MS Word documents have been used to examine the nature and extent of individual edits within a document. These RSIDs remain logged in the metadata even if the text with which they were associated has been removed. As copies of such files retain the original's RSIDs, this metadata can be used to determine the order in which documents were cloned from each other. As a proof of concept, this paper examined over 400 template files generated by a single publisher for manuscript submissions to its journals. The study shows that it is possible to establish genealogies, and thus relative chronologies, of born-digital content by first identifying those documents that share a document (root) RSID and then seriating those RSIDs that are shared between two or more documents.
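The two steps named in the abstract (grouping documents by a shared root RSID, then ordering them by the RSIDs they share) can be illustrated with a minimal Python sketch. It is not the authors' procedure. It assumes that each file is a standard (Transitional) .docx whose `word/settings.xml` part holds a `w:rsids` element with one `w:rsidRoot` child and several `w:rsid` children, and that a clone keeps every RSID of its parent and only adds new ones. The function names `read_rsids`, `group_by_root`, `seriate`, and `is_possible_ancestor`, as well as the file names and RSID values in the usage lines, are hypothetical.

```python
import zipfile
import xml.etree.ElementTree as ET
from collections import defaultdict

# WordprocessingML namespace used by Transitional .docx files (assumption:
# Strict-conformance files use a different namespace and are not handled).
W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def read_rsids(docx):
    """Return (root_rsid, set_of_rsids) read from word/settings.xml.

    `docx` may be a path or a binary file-like object.
    """
    with zipfile.ZipFile(docx) as zf:
        settings = ET.fromstring(zf.read("word/settings.xml"))
    rsids = settings.find(f"{{{W}}}rsids")
    if rsids is None:
        return None, set()
    root_el = rsids.find(f"{{{W}}}rsidRoot")
    root_rsid = root_el.get(f"{{{W}}}val") if root_el is not None else None
    values = {el.get(f"{{{W}}}val") for el in rsids.findall(f"{{{W}}}rsid")}
    return root_rsid, values


def group_by_root(docs):
    """Group document names by root RSID.

    `docs` maps a name to the (root_rsid, rsid_set) pair from read_rsids.
    """
    families = defaultdict(list)
    for name, (root_rsid, _) in docs.items():
        if root_rsid is not None:
            families[root_rsid].append(name)
    return dict(families)


def seriate(docs, names):
    """Order one family from fewest to most RSIDs (a simplifying heuristic:
    under the stated assumption, later clones accumulate more RSIDs)."""
    return sorted(names, key=lambda n: (len(docs[n][1]), n))


def is_possible_ancestor(docs, older, newer):
    """True if every RSID of `older` also occurs in `newer`, plus extras."""
    return docs[older][1] < docs[newer][1]


# Usage with made-up RSID values (in practice, fill `docs` via read_rsids).
docs = {
    "template_v2.docx": ("00A11111", {"00A11111", "00B22222"}),
    "template_v1.docx": ("00A11111", {"00A11111"}),
    "unrelated.docx": ("00C33333", {"00C33333"}),
}
families = group_by_root(docs)
ordered = seriate(docs, families["00A11111"])
print(ordered)
```

Ordering by set size only yields a relative sequence along a single lineage; where a collection branches, the pairwise subset check in `is_possible_ancestor` is the more cautious test, and real files may not satisfy strict containment.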
Source journal
Publications (Social Sciences: Library and Information Sciences)
CiteScore: 6.50
Self-citation rate: 1.90%
Articles published: 40
Review time: 11 weeks
Journal description: The scope of Publications includes: theory and practice of scholarly communication; digitisation and innovations in scholarly publishing technologies; metadata, infrastructure, and linking the scholarly record; publishing policies and editorial/peer-review workflows; financial models for scholarly publishing; copyright, licensing and legal issues in publishing; research integrity and publication ethics; issues and best practices in the publication of non-traditional research outputs (e.g., data, software/code, protocols, data management plans, grant proposals, etc.); issues in the transition to open access and open science; inclusion and participation of traditionally excluded actors; language issues in publication processes and products; traditional and alternative models of peer review; traditional and alternative means of assessment and evaluation of research and its impact, including bibliometrics and scientometrics; and the place of research libraries, scholarly societies, funders and others in scholarly communication.