Large Scale Detection of Irregularities in Accounting Data

Sixth International Conference on Data Mining (ICDM'06). Pub Date: 2006-12-18. DOI: 10.1109/ICDM.2006.93

Stephen D. Bay, K. Kumaraswamy, M. Anderle, Rohit Kumar, D. Steier

Citations: 89

Abstract

Large Scale Detection of Irregularities in Accounting Data

In recent years, there have been several large accounting frauds where a company's financial results have been intentionally misrepresented by billions of dollars. In response, regulatory bodies have mandated that auditors perform analytics on detailed financial data with the intent of discovering such misstatements. For a large auditing firm, this may mean analyzing millions of records from thousands of clients. This paper proposes techniques for automatic analysis of company general ledgers on such a large scale, identifying irregularities - which may indicate fraud or just honest errors - for additional review by auditors. These techniques have been implemented in a prototype system, called Sherlock, which combines aspects of both outlier detection and classification. In developing Sherlock, we faced three major challenges: developing an efficient process for obtaining data from many heterogeneous sources, training classifiers with only positive and unlabeled examples, and presenting information to auditors in an easily interpretable manner. In this paper, we describe how we addressed these challenges over the past two years and report on experiments evaluating Sherlock.
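
The abstract does not say how Sherlock scores ledger entries, so the following is only an illustrative sketch of the general outlier-detection idea it mentions, not the authors' method. It assumes ledger entries are simple (entry id, account, amount) tuples and flags amounts that sit far from the median of their account, using a median/MAD-based modified z-score. The function names `robust_z_scores` and `flag_irregular`, the tuple layout, and the cutoff of 3.5 are all assumptions made for this example.

```python
from statistics import median


def robust_z_scores(amounts):
    """Return a modified z-score per amount, based on the median and MAD."""
    med = median(amounts)
    mad = median(abs(a - med) for a in amounts)
    if mad == 0:
        # At least half the values equal the median, so MAD gives no usable scale.
        return [0.0 for _ in amounts]
    # 0.6745 rescales the MAD so scores are comparable to ordinary z-scores
    # when the data are roughly normal.
    return [0.6745 * (a - med) / mad for a in amounts]


def flag_irregular(entries, threshold=3.5):
    """Flag entries whose amount is unusual for their account.

    entries: iterable of (entry_id, account, amount) tuples (hypothetical layout).
    Returns (entry_id, account, score) tuples, most extreme first.
    """
    by_account = {}
    for entry_id, account, amount in entries:
        by_account.setdefault(account, []).append((entry_id, amount))

    flagged = []
    for account, rows in by_account.items():
        scores = robust_z_scores([amount for _, amount in rows])
        for (entry_id, _), score in zip(rows, scores):
            if abs(score) > threshold:
                flagged.append((entry_id, account, round(score, 2)))

    # Rank so a reviewer sees the most extreme entries first.
    flagged.sort(key=lambda row: abs(row[2]), reverse=True)
    return flagged


if __name__ == "__main__":
    # Made-up sample data, for illustration only.
    sample = [
        ("e1", "travel", 120.0),
        ("e2", "travel", 135.0),
        ("e3", "travel", 110.0),
        ("e4", "travel", 128.0),
        ("e5", "travel", 9800.0),
        ("e6", "rent", 2000.0),
        ("e7", "rent", 2000.0),
        ("e8", "rent", 2000.0),
    ]
    for entry_id, account, score in flag_irregular(sample):
        print(entry_id, account, score)
```

A flagged entry here is only a candidate for human review: as the abstract notes, an irregularity may reflect fraud or just an honest error. Grouping by account is one possible design choice; a real system would likely also condition on other attributes of an entry.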
