Predicting financial fraud in Chinese listed companies: An enterprise portrait and machine learning approach
Authors: Zejun Zhang, Zhao Wang, Lixin Cai
Journal: Pacific-Basin Finance Journal, volume 90, Article 102665 (Journal Article; impact factor 4.8; JCR Q1, Business, Finance)
Publication date: 2025-01-06
DOI: 10.1016/j.pacfin.2025.102665
URL: https://www.sciencedirect.com/science/article/pii/S0927538X25000022
Citations: 0
Abstract
Financial fraud by listed companies is a recurring problem in the capital market. Owing to factors such as information asymmetry and inadequate regulation, financial fraud severely restricts stakeholders' capital allocation and hinders the sustainable development of the capital market. Existing research, however, lacks systematic, quantitative insight into the characteristics of firms involved in financial fraud, which makes most such firms difficult to identify quantitatively. This limitation arises from a predominant focus on the causal relationships between individual financial indicators and financial fraud. In this paper, we integrate machine learning and enterprise portrait methods to predict corporate financial fraud, using listed companies in the Chinese capital market as research subjects. First, we establish a comprehensive indicator system covering seven dimensions: basic corporate information, profitability, solvency, operating efficiency, capital structure, corporate governance, and emotional attitude. Next, we create a feature visualization portrait using Gaussian mixture model (GMM) clustering and label classification, and examine the predictive role of multidimensional enterprise portrait features in assessing the risk of corporate financial fraud. The results indicate that unstructured indicators, such as Management Discussion and Analysis (MD&A), significantly enhance the ability to predict corporate financial fraud. The SHapley Additive exPlanations (SHAP) method is used to reveal the influencing factors and characteristics of financial fraud. The empirical findings show that firms involved in financial fraud typically have shorter listing times, weaker solvency and operating efficiency, higher capital structure, and poor corporate governance. Among the models compared, XGBoost shows the best predictive performance. These findings offer a new perspective for in-depth study of the mechanisms behind financial fraud and of related regulatory warnings, and they contribute to more effective governance and a better-functioning capital allocation role for the capital market.
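To illustrate the GMM clustering and label classification step mentioned in the abstract, the following is a minimal sketch and not the authors' implementation. It fits a two-component, one-dimensional Gaussian mixture with expectation-maximization and maps each component to a readable label. The indicator (a current-ratio-like solvency measure), the synthetic data, the label names, and all function names are assumptions made purely for illustration; the paper's actual features and settings are not given here.

```python
import math
import random


def normal_pdf(x, mean, var):
    """Density of a univariate normal distribution."""
    return math.exp(-(x - mean) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)


def fit_gmm_1d(values, k=2, iters=100, min_var=1e-6):
    """Fit a k-component 1-D Gaussian mixture with plain EM."""
    n = len(values)
    ordered = sorted(values)
    # Initialise means at evenly spaced quantiles, with a shared variance
    # and equal mixing weights.
    means = [ordered[int((i + 0.5) * n / k)] for i in range(k)]
    overall = sum(values) / n
    var0 = max(sum((v - overall) ** 2 for v in values) / n, min_var)
    variances = [var0] * k
    weights = [1.0 / k] * k
    for _ in range(iters):
        # E-step: responsibility of each component for each observation.
        resp = []
        for x in values:
            dens = [w * normal_pdf(x, m, v)
                    for w, m, v in zip(weights, means, variances)]
            total = sum(dens) or 1e-300
            resp.append([d / total for d in dens])
        # M-step: re-estimate weights, means, and variances.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            if nj < 1e-12:
                continue  # leave an empty component unchanged
            weights[j] = nj / n
            means[j] = sum(r[j] * x for r, x in zip(resp, values)) / nj
            spread = sum(r[j] * (x - means[j]) ** 2
                         for r, x in zip(resp, values)) / nj
            variances[j] = max(spread, min_var)
    return weights, means, variances


def assign_labels(values, weights, means, variances, names):
    """Label each value with the name of its most likely component.

    Components are ranked by mean, so names[0] goes to the lowest one.
    """
    order = sorted(range(len(means)), key=lambda j: means[j])
    name_of = {j: names[rank] for rank, j in enumerate(order)}
    labels = []
    for x in values:
        dens = [w * normal_pdf(x, m, v)
                for w, m, v in zip(weights, means, variances)]
        labels.append(name_of[max(range(len(dens)), key=dens.__getitem__)])
    return labels


# Synthetic, hypothetical solvency indicator for 400 firms (not real data).
random.seed(0)
values = ([random.gauss(0.8, 0.1) for _ in range(200)]
          + [random.gauss(2.0, 0.2) for _ in range(200)])

weights, means, variances = fit_gmm_1d(values, k=2)
labels = assign_labels(values, weights, means, variances,
                       names=["weak solvency", "strong solvency"])
print(sorted(round(m, 2) for m in means))
```

Labels of this kind, produced per indicator dimension, could then serve as categorical portrait features for a downstream classifier. In practice a library implementation such as scikit-learn's `GaussianMixture` would be the more usual choice than hand-written EM.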
About the journal:
The Pacific-Basin Finance Journal is aimed at providing a specialized forum for the publication of academic research on capital markets of the Asia-Pacific countries. Primary emphasis will be placed on the highest quality empirical and theoretical research in the following areas: • Market Micro-structure; • Investment and Portfolio Management; • Theories of Market Equilibrium; • Valuation of Financial and Real Assets; • Behavior of Asset Prices in Financial Sectors; • Normative Theory of Financial Management; • Capital Markets of Development; • Market Mechanisms.