Deep unsupervised anomaly detection in high-frequency markets
Authors: Cédric Poutré, Didier Chételat, Manuel Morales
Journal: Journal of Finance and Data Science, volume 10, Article 100129
Published: 2024-04-19
DOI: 10.1016/j.jfds.2024.100129
URL: https://www.sciencedirect.com/science/article/pii/S240591882400014X
Inspired by recent advances in the deep learning literature, this article introduces a novel hybrid anomaly detection framework designed specifically for limit order book (LOB) data. A modified Transformer autoencoder architecture is proposed to learn rich temporal representations of LOB subsequences, which makes normal and fraudulent time series easier to separate. A dissimilarity function is then learned in the representation space to characterize normal LOB behavior, enabling the detection of anomalous subsequences out of sample. We also develop a complete trade-based manipulation simulation methodology that generates a variety of scenarios derived from actual trade-based fraud cases. The complete framework is tested on LOB data for five NASDAQ stocks, into which we randomly insert synthetic quote stuffing, layering, and pump-and-dump manipulations. We show that the proposed asset-independent approach achieves new state-of-the-art fraud detection performance without requiring any prior knowledge of manipulation patterns.
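The two-stage idea in the abstract (map each subsequence to a representation, then score it by its dissimilarity to normal behavior) can be illustrated with a toy sketch. This is not the authors' method, and every name in it is hypothetical. A hand-crafted two-feature encoder stands in for the Transformer autoencoder, a standardized Euclidean distance to the mean normal representation stands in for the learned dissimilarity function, a Gaussian random walk stands in for real LOB data, and a triangular price ramp stands in for the paper's pump-and-dump simulation. The threshold rule (maximum training score) is likewise an assumption made for illustration.

```python
import math
import random
import statistics


def windows(series, length, stride):
    """Split a series into fixed-length subsequences (sliding windows)."""
    return [series[i:i + length] for i in range(0, len(series) - length + 1, stride)]


def encode(window):
    """Stand-in encoder (NOT a Transformer autoencoder): two summary features
    of the one-step changes, their standard deviation and largest magnitude."""
    diffs = [b - a for a, b in zip(window, window[1:])]
    return (statistics.pstdev(diffs), max(abs(d) for d in diffs))


def fit_normal_profile(train_windows):
    """Summarize normal behavior as per-feature mean and std of the representations."""
    reps = [encode(w) for w in train_windows]
    means = [statistics.fmean(col) for col in zip(*reps)]
    stds = [statistics.pstdev(col) or 1.0 for col in zip(*reps)]
    return means, stds


def dissimilarity(window, profile):
    """Stand-in score: standardized Euclidean distance to the normal profile."""
    means, stds = profile
    rep = encode(window)
    return math.sqrt(sum(((r - m) / s) ** 2 for r, m, s in zip(rep, means, stds)))


def simulate_mid_price(n, rng, start=100.0, sigma=0.01):
    """Toy Gaussian random-walk price path; a placeholder for real LOB features."""
    price, path = start, []
    for _ in range(n):
        price += rng.gauss(0.0, sigma)
        path.append(price)
    return path


def insert_pump_and_dump(series, start, length, height):
    """Overlay a triangular ramp (up, then down) on a copy of the series."""
    out = list(series)
    half = length // 2
    for k in range(length):
        frac = k / half if k < half else (length - k) / (length - half)
        out[start + k] += height * frac
    return out


WINDOW, STRIDE = 50, 10
rng = random.Random(0)

# "Normal" training data only: no labels and no manipulation examples are used.
train_windows = windows(simulate_mid_price(2000, rng), WINDOW, STRIDE)
profile = fit_normal_profile(train_windows)
threshold = max(dissimilarity(w, profile) for w in train_windows)

# Out-of-sample data with one synthetic manipulation inserted at steps 200-239.
test_series = insert_pump_and_dump(simulate_mid_price(500, rng), start=200, length=40, height=1.0)
test_windows = windows(test_series, WINDOW, STRIDE)
flags = [dissimilarity(w, profile) > threshold for w in test_windows]

print("flagged window starts:", [i * STRIDE for i, f in enumerate(flags) if f])
```

The sketch shares one property with the described approach: the scorer is fit on normal subsequences only, so it needs no prior description of the manipulation pattern. Unlike the paper's learned components, the hand-picked features here would likely miss manipulations that leave price changes largely unaffected.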