A file archival integrity check method based on the BiLSTM + CNN model and deep learning

IF 5 | Zone 3, Computer Science | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

Egyptian Informatics Journal Pub Date : 2025-01-09 DOI:10.1016/j.eij.2024.100597

Jinxun Li, Tingjun Wang, Chao Ma, Yunxuan Lin, Qing Yan

Egyptian Informatics Journal, Volume 29, Article 100597. https://www.sciencedirect.com/science/article/pii/S1110866524001609

Citations: 0

Abstract

Validating archives and checking their integrity ensures that files are authentic, trustworthy, and usable. In the age of digital technology, historical records must be genuine. Archival research raises ethical issues even when it has little to do with individuals. Traditional archive integrity solutions have scaling issues, real-time monitoring issues, and missed opportunities. This paper presents an updated Archive File Integrity Check Method (AFICM) that may solve these issues. The method uses deep learning to combine a Bidirectional Long Short-Term Memory (Bi-LSTM) network with adaptive gating and an adaptive Temporal Convolutional Neural Network (TCNN) with multi-scale temporal attention, with the aim of protecting archived material against manipulation. The adaptive TCNN, trained on file data, first extracts complex sequential patterns and variants. A Bi-LSTM network with an attenuation method then analyzes these features, highlighting significant temporal correlations while selectively downplaying irrelevant data. In short, the hybrid model uses the adaptive TCNN for time-related feature extraction and the attenuated Bi-LSTM for refinement, and it outperforms checksums in accuracy and dependability. The model is evaluated on F1 score, recall, accuracy, precision, and AU-ROC. The AFICM performed well overall, with 97.32% precision and 98.95% accuracy. It outperforms other integrity check methods with an F1 score of 97.58, an AU-ROC of 0.983, and a recall of 98.18%. By showing the model’s dependability and security in several use scenarios, the findings set a new standard for integrity testing of archiving systems.
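The abstract names the evaluation measures but does not show how they are computed. The sketch below is one minimal way to derive precision, recall, F1, accuracy, and AU-ROC for a binary integrity classifier. It is not the authors' evaluation code: the function names, the convention that label 1 means "tampered", the 0.5 decision threshold, and the toy data are all assumptions made for illustration.

```python
from __future__ import annotations

from typing import Sequence


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> tuple[int, int, int, int]:
    """Return (tp, fp, fn, tn). Assumption: label 1 = tampered file (positive class)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return tp, fp, fn, tn


def classification_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> dict[str, float]:
    """Precision, recall, F1, and accuracy from hard 0/1 predictions."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    total = tp + fp + fn + tn
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
    accuracy = (tp + tn) / total if total else 0.0
    return {"precision": precision, "recall": recall, "f1": f1, "accuracy": accuracy}


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AU-ROC as the probability that a random positive outscores a random negative.

    Ties count as half a win. This pairwise form is O(P * N), which is fine for a sketch.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AU-ROC needs at least one positive and one negative example")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical): ground-truth labels and model scores for eight files.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.1, 0.2, 0.7]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed threshold of 0.5

print(classification_metrics(y_true, y_pred))
print(roc_auc(y_true, scores))
```

On this toy data the four threshold-based metrics each come out to 0.75 and the AU-ROC to 0.9375. The threshold-based metrics depend on the chosen cut-off, whereas AU-ROC summarizes ranking quality across all cut-offs, which is one reason a report might list both.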

Source journal

Egyptian Informatics Journal Decision Sciences-Management Science and Operations Research

CiteScore

11.10

Self-citation rate

1.90%

Articles published

Review time

110 days

About the journal: The Egyptian Informatics Journal is published by the Faculty of Computers and Artificial Intelligence, Cairo University. The Journal provides a forum for state-of-the-art research and development in the fields of computing, including computer sciences, information technologies, information systems, operations research and decision support. Submissions of innovative, previously unpublished work in subjects covered by the Journal are encouraged, whether from academic, research or commercial sources.