Optimized Deep Isolation Forest

Impact Factor 3.3; Zone 3, Computer Science; JCR Q2, Computer Science, Artificial Intelligence

Pattern Recognition Letters Pub Date : 2025-07-25 DOI:10.1016/j.patrec.2025.07.014

Łukasz Gałka

Pattern Recognition Letters, Volume 197, Pages 88-94. Full text: https://www.sciencedirect.com/science/article/pii/S0167865525002661

Citation count: 0

Abstract

Anomaly detection and the identification of elements that do not fit the data characteristics are increasingly used in information systems, both for data cleaning and for finding unusual elements. Unsupervised anomaly detection methods are particularly useful in this context. This paper introduces the Optimized Deep Isolation Forest (ODIF), an optimized version of the Deep Isolation Forest (DIF) algorithm. ODIF optimizes the operations performed during DIF training, which reduces computational and memory complexity. In a series of experiments, both DIF and ODIF are implemented, and their effectiveness is evaluated using the Area Under the Precision-Recall Curve (PR AUC). The proposed method demonstrates significantly better detection performance than the baseline Isolation Forest and competing techniques. Additionally, the execution time of the training phase is measured for both the CPU and GPU stages, along with memory usage, including RAM and VRAM. The results indicate that ODIF executes much faster than DIF: on average, the CPU stage is over one and a half times shorter and the GPU stage nearly 150 times shorter. Memory usage is likewise significantly lower for ODIF than for DIF, with RAM consumption reduced by approximately 18% and VRAM by over 55%.
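
As a rough illustration of the PR AUC metric named in the abstract, the sketch below computes average precision, one common step-wise estimate of the area under the precision-recall curve. The abstract does not say which estimator the paper uses, so that choice is an assumption here, and the function name and the toy labels and scores are hypothetical, not taken from the paper.

```python
def average_precision(labels, scores):
    """Step-wise area under the precision-recall curve (average precision).

    labels: 1 for an anomaly, 0 for a normal sample.
    scores: anomaly scores, where higher means more anomalous.
    Assumption: scores contain no ties (ties would need threshold grouping).
    """
    n_pos = sum(labels)
    if n_pos == 0:
        raise ValueError("need at least one anomalous (positive) label")

    # Rank samples from most to least anomalous.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)

    true_pos = 0
    ap = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            true_pos += 1
            # Precision at this rank, weighted by the recall step 1 / n_pos.
            ap += (true_pos / rank) / n_pos
    return ap


# Hypothetical toy data: two anomalies among eight samples.
labels = [0, 0, 1, 0, 1, 0, 0, 0]
scores = [0.10, 0.60, 0.90, 0.30, 0.40, 0.35, 0.15, 0.05]

ap = average_precision(labels, scores)
print(round(ap, 4))  # 0.8333
```

In this toy ranking the two anomalies land at ranks 1 and 3, so the value is (1/1 + 2/3) / 2. Unlike ROC AUC, this metric is sensitive to how rare the anomalies are, which is one reason it is often preferred for heavily imbalanced anomaly detection benchmarks.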




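Source journal

Pattern Recognition Letters (Engineering and Technology, Computer Science: Artificial Intelligence)

CiteScore: 12.40

Self-citation rate: 5.90%

Articles published: 287

Review time: 9.1 months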

Journal description: Pattern Recognition Letters aims at rapid publication of concise articles of broad interest in pattern recognition. Subject areas include all the current fields of interest represented by the Technical Committees of the International Association for Pattern Recognition, and other developing themes involving learning and recognition.