{"title":"使用数据流形上的规范化流进行分布外检测","authors":"Seyedeh Fatemeh Razavi, Mohammadmahdi Mehmanchi, Reshad Hosseini, Mostafa Tavassolipour","doi":"10.1007/s10489-025-06499-x","DOIUrl":null,"url":null,"abstract":"<div><p>Using the intuition that out-of-distribution data have lower likelihoods, a common approach for out-of-distribution detection involves estimating the underlying data distribution. Normalizing flows are likelihood-based generative models providing a tractable density estimation via dimension-preserving invertible transformations. Conventional normalizing flows are prone to fail in out-of-distribution detection, because of the well-known curse of dimensionality problem of the likelihood-based models. To solve the problem of likelihood-based models, some works try to modify likelihood for example by incorporating a data complexity measure. We observed that these modifications are still insufficient. According to the manifold hypothesis, real-world data often lie on a low-dimensional manifold. Therefore, we proceed by estimating the density on a low-dimensional manifold and calculating a distance from the manifold as a measure for out-of-distribution detection. We propose a powerful criterion that combines this measure with the modified likelihood measure based on data complexity. Extensive experimental results show that incorporating manifold learning while accounting for the estimation of data complexity improves the out-of-distribution detection ability of normalizing flows. This improvement is achieved without modifying the model structure or using auxiliary out-of-distribution data during training.</p></div>","PeriodicalId":8041,"journal":{"name":"Applied Intelligence","volume":"55 7","pages":""},"PeriodicalIF":3.4000,"publicationDate":"2025-04-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Out-of-distribution detection using normalizing flows on the data manifold\",\"authors\":\"Seyedeh Fatemeh Razavi, Mohammadmahdi Mehmanchi, Reshad Hosseini, Mostafa Tavassolipour\",\"doi\":\"10.1007/s10489-025-06499-x\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<div><p>Using the intuition that out-of-distribution data have lower likelihoods, a common approach for out-of-distribution detection involves estimating the underlying data distribution. Normalizing flows are likelihood-based generative models providing a tractable density estimation via dimension-preserving invertible transformations. Conventional normalizing flows are prone to fail in out-of-distribution detection, because of the well-known curse of dimensionality problem of the likelihood-based models. To solve the problem of likelihood-based models, some works try to modify likelihood for example by incorporating a data complexity measure. We observed that these modifications are still insufficient. According to the manifold hypothesis, real-world data often lie on a low-dimensional manifold. Therefore, we proceed by estimating the density on a low-dimensional manifold and calculating a distance from the manifold as a measure for out-of-distribution detection. We propose a powerful criterion that combines this measure with the modified likelihood measure based on data complexity. Extensive experimental results show that incorporating manifold learning while accounting for the estimation of data complexity improves the out-of-distribution detection ability of normalizing flows. 
This improvement is achieved without modifying the model structure or using auxiliary out-of-distribution data during training.</p></div>\",\"PeriodicalId\":8041,\"journal\":{\"name\":\"Applied Intelligence\",\"volume\":\"55 7\",\"pages\":\"\"},\"PeriodicalIF\":3.4000,\"publicationDate\":\"2025-04-15\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Applied Intelligence\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://link.springer.com/article/10.1007/s10489-025-06499-x\",\"RegionNum\":2,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q2\",\"JCRName\":\"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Applied Intelligence","FirstCategoryId":"94","ListUrlMain":"https://link.springer.com/article/10.1007/s10489-025-06499-x","RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
Citations: 0
Abstract
Using the intuition that out-of-distribution data have lower likelihoods, a common approach to out-of-distribution detection is to estimate the underlying data distribution. Normalizing flows are likelihood-based generative models that provide tractable density estimation via dimension-preserving invertible transformations. Conventional normalizing flows are prone to fail at out-of-distribution detection because of the well-known curse-of-dimensionality problem of likelihood-based models. To address this problem, some works modify the likelihood, for example by incorporating a data-complexity measure. We observe that these modifications are still insufficient. According to the manifold hypothesis, real-world data often lie on a low-dimensional manifold. We therefore estimate the density on a low-dimensional manifold and compute the distance from that manifold as a measure for out-of-distribution detection. We propose a powerful criterion that combines this measure with a modified likelihood measure based on data complexity. Extensive experimental results show that incorporating manifold learning while accounting for data complexity improves the out-of-distribution detection ability of normalizing flows. This improvement is achieved without modifying the model structure or using auxiliary out-of-distribution data during training.
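The abstract describes the detection rule only at a high level: estimate a density on a learned low-dimensional manifold, measure each sample's distance from that manifold, and combine this with a complexity-corrected likelihood. The following is a minimal sketch of that scoring idea under stated stand-ins, not the authors' implementation: PCA stands in for the learned manifold mapping, a Gaussian fitted in the low-dimensional space stands in for the normalizing-flow density, gzip-compressed length stands in for the data-complexity estimate, and the weights alpha and beta are hypothetical.

```python
# Sketch of a combined OOD score: complexity-corrected on-manifold likelihood
# plus distance from the manifold.  All components are stand-ins, not the
# paper's actual models.
import gzip
import numpy as np

rng = np.random.default_rng(0)

# Toy in-distribution data lying near a 2-D manifold embedded in 20-D space.
latent = rng.normal(size=(1000, 2))
embed = rng.normal(size=(2, 20))
x_train = latent @ embed + 0.01 * rng.normal(size=(1000, 20))

# "Manifold" fit: top-k principal directions (stand-in for an injective flow).
k = 2
mean = x_train.mean(axis=0)
_, _, vt = np.linalg.svd(x_train - mean, full_matrices=False)
basis = vt[:k]                                  # (k, 20), orthonormal rows

# On-manifold density: Gaussian fitted to the projected training data
# (stand-in for the normalizing-flow density on the manifold).
z_train = (x_train - mean) @ basis.T
mu = z_train.mean(axis=0)
cov = np.cov(z_train, rowvar=False)
cov_inv = np.linalg.inv(cov)
logdet = np.linalg.slogdet(cov)[1]

def log_density_on_manifold(x):
    """Gaussian log-density of the sample's projection onto the manifold."""
    d = (x - mean) @ basis.T - mu
    return -0.5 * (d @ cov_inv @ d + logdet + k * np.log(2 * np.pi))

def distance_from_manifold(x):
    """Euclidean reconstruction error after projecting onto the manifold."""
    x_hat = mean + ((x - mean) @ basis.T) @ basis
    return np.linalg.norm(x - x_hat)

def complexity(x):
    """Crude complexity proxy: gzip-compressed byte length of the raw sample."""
    return len(gzip.compress(np.float32(x).tobytes()))

def ood_score(x, alpha=1.0, beta=1.0):
    # Higher score = more likely out-of-distribution.  The complexity term
    # corrects the likelihood (simple inputs otherwise get inflated density);
    # the distance term penalizes samples that fall off the learned manifold.
    return (-log_density_on_manifold(x)
            - alpha * complexity(x)
            + beta * distance_from_manifold(x))

x_in = x_train[0]                    # lies on the training manifold
x_out = 5.0 * rng.normal(size=20)    # generic point far from the manifold
print("in-distribution score:", ood_score(x_in))
print("out-of-distribution score:", ood_score(x_out))
```

In this toy setup the reconstruction-distance term is what separates the off-manifold sample, illustrating why a full-dimensional likelihood alone can miss such points.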
Journal description:
With a focus on research in artificial intelligence and neural networks, this journal addresses issues involving solutions of real-life manufacturing, defense, management, government and industrial problems which are too complex to be solved through conventional approaches and require the simulation of intelligent thought processes, heuristics, applications of knowledge, and distributed and parallel processing. The integration of these multiple approaches in solving complex problems is of particular importance.
The journal presents new and original research and technological developments, addressing real and complex issues applicable to difficult problems. It provides a medium for exchanging scientific research and technological achievements accomplished by the international community.