Optimized Deep Isolation Forest

IF 3.3 | Zone 3, Computer Science | JCR Q2, COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE
Łukasz Gałka
Pattern Recognition Letters, vol. 197, pp. 88-94. Published 2025-07-25. Journal Article.
DOI: 10.1016/j.patrec.2025.07.014
URL: https://www.sciencedirect.com/science/article/pii/S0167865525002661
Citations: 0

Abstract

Anomaly detection and the identification of elements that do not fit the data characteristics are increasingly used in information systems, both for data cleaning and for finding unusual elements. Unsupervised anomaly detection methods are particularly useful in this context. This paper introduces the Optimized Deep Isolation Forest (ODIF) as an optimized version of the Deep Isolation Forest (DIF) algorithm. The training of DIF is subjected to an optimization of the operations performed, which leads to a reduction of the computational and memory complexity. In a series of experiments, both DIF and ODIF are implemented, and their effectiveness is evaluated using Area Under the Precision-Recall Curve (PR AUC). The proposed method demonstrates significantly better detection performance compared to the baseline Isolation Forest and competitive techniques. Additionally, the execution times of the training phase are measured for both the CPU and GPU stages, as well as memory usage, including RAM and VRAM. The results unequivocally indicate a much faster execution of the ODIF algorithm compared to DIF, with average CPU stage and GPU stage times being over one and a half times and nearly 150 times shorter, respectively. Similarly, memory usage is significantly reduced for ODIF in comparison to DIF, with RAM consumption lowered by approximately 18% and VRAM by over 55%.
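
The abstract names PR AUC as the evaluation metric but does not say how the area is estimated. As a minimal sketch, assuming the common average-precision estimate (a step-wise sum over the ranked scores), the metric could be computed as follows. The function name, the toy labels, and the toy scores are illustrative assumptions and are not taken from the paper.

```python
def average_precision(labels, scores):
    """Step-wise estimate of the area under the precision-recall curve.

    labels: 1 for an anomaly, 0 for a normal sample.
    scores: anomaly scores, where higher means more anomalous.
    Assumption: average precision is used as the PR AUC estimate;
    the paper's exact estimator is not stated in the abstract.
    """
    if len(labels) != len(scores):
        raise ValueError("labels and scores must have the same length")
    total_pos = sum(labels)
    if total_pos == 0:
        raise ValueError("need at least one anomalous (positive) label")

    # Rank samples from most to least anomalous.
    # Tied scores keep their input order (Python's sort is stable).
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)

    hits = 0
    precision_sum = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            # Precision at the rank where recall increases.
            precision_sum += hits / rank
    return precision_sum / total_pos


# Hypothetical toy data: two anomalies among six samples.
toy_labels = [0, 0, 1, 0, 1, 0]
toy_scores = [0.10, 0.40, 0.90, 0.20, 0.35, 0.30]
print(round(average_precision(toy_labels, toy_scores), 4))  # prints 0.8333
```

In the toy ranking the anomalies land at ranks 1 and 3, so the precisions at those ranks are 1/1 and 2/3, and their mean is 5/6. A metric of this kind is often preferred over ROC AUC for anomaly detection because anomalies are rare and precision is sensitive to false positives among the top-ranked samples.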


Source journal
Pattern Recognition Letters (Engineering and Technology - Computer Science: Artificial Intelligence)
CiteScore: 12.40
Self-citation rate: 5.90%
Articles published: 287
Review time: 9.1 months
Journal description: Pattern Recognition Letters aims at rapid publication of concise articles of a broad interest in pattern recognition. Subject areas include all the current fields of interest represented by the Technical Committees of the International Association of Pattern Recognition, and other developing themes involving learning and recognition.