A Novel Example-Dependent Cost-Sensitive Stacking Classifier to Identify Tax Return Defaulters

IF 7.4, Zone 3 (Management), Q1 Computer Science, Information Systems

Business & Information Systems Engineering Pub Date : 2021-07-02 DOI:10.52825/bis.v1i.61

Sanat Bhargava, Manish Kumar, P. Mehta, Jithin Mathews, K. S. Kumar, C. Babu

Volume: 24 1; pages: 343-353; publication type: Journal Article

Citations: 0

Abstract

Tax evasion refers to an entity engaging in illegal activities to avoid paying its actual tax liability. A tax return statement is a periodic report comprising information about income, expenditure, etc. One of the most basic tax evasion methods is failing to file tax returns or delaying their filing. Taxpayers who do not file their returns, or fail to do so within the stipulated period, are called tax return defaulters. The Government has to bear the financial losses caused by such defaults, and these losses vary from taxpayer to taxpayer. Therefore, when designing any statistical model to predict potential return defaulters, we have to consider the real financial loss associated with misclassifying each individual. This paper proposes a framework for an example-dependent cost-sensitive stacking classifier that uses cost-insensitive classifiers as base generalizers to make predictions on the input space. These predictions are used to train an example-dependent cost-sensitive meta-generalizer. Based on the choice of meta-generalizer, we propose four variant models to predict potential return defaulters for the upcoming tax-filing period. These models have been developed for the Commercial Taxes Department, Government of Telangana, India. Applying the proposed variant models to Goods and Services Tax (GST) data, we observe a significant increase in savings compared to conventional classifiers. Additionally, we present an empirical study showing that our approach is more adept at identifying potential tax return defaulters than existing example-dependent cost-sensitive classification algorithms.
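To make the idea of example-dependent costs concrete, the following is a minimal sketch and not the authors' implementation. It assumes that each taxpayer carries a hypothetical false-negative cost (revenue lost if a defaulter is missed) and a false-positive cost (a needless follow-up), that correct predictions cost nothing, and that savings are measured relative to the cheaper of the two trivial policies (flag nobody or flag everybody), a definition common in the cost-sensitive literature but not stated in the abstract. The trained meta-generalizer is replaced here by a simple minimum-expected-cost rule over the averaged base-classifier scores; all names and numbers are illustrative.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Example:
    base_scores: List[float]  # default probabilities from cost-insensitive base classifiers
    label: int                # 1 = return defaulter, 0 = timely filer
    cost_fn: float            # hypothetical loss if this defaulter is missed
    cost_fp: float            # hypothetical cost of wrongly flagging this taxpayer


def total_cost(examples: List[Example], preds: List[int]) -> float:
    """Sum the per-example misclassification costs (correct predictions cost 0)."""
    cost = 0.0
    for ex, pred in zip(examples, preds):
        if ex.label == 1 and pred == 0:
            cost += ex.cost_fn
        elif ex.label == 0 and pred == 1:
            cost += ex.cost_fp
    return cost


def savings(examples: List[Example], preds: List[int]) -> float:
    """Relative cost reduction versus the cheaper trivial policy (assumed definition)."""
    n = len(examples)
    base = min(total_cost(examples, [0] * n), total_cost(examples, [1] * n))
    return (base - total_cost(examples, preds)) / base


def meta_predict(ex: Example) -> int:
    """Stand-in for the meta-generalizer: flag when expected loss of not flagging is larger."""
    p = sum(ex.base_scores) / len(ex.base_scores)  # averaged base-level probability
    return 1 if p * ex.cost_fn > (1 - p) * ex.cost_fp else 0


# Illustrative data only; these are not figures from the paper.
examples = [
    Example([0.3, 0.4], label=1, cost_fn=1000.0, cost_fp=10.0),
    Example([0.6, 0.7], label=0, cost_fn=20.0, cost_fp=10.0),
    Example([0.1, 0.1], label=0, cost_fn=50.0, cost_fp=10.0),
    Example([0.8, 0.9], label=1, cost_fn=200.0, cost_fp=10.0),
]

cs_preds = [meta_predict(ex) for ex in examples]
# Cost-insensitive comparison: a fixed 0.5 threshold on the averaged score.
naive_preds = [1 if sum(ex.base_scores) / len(ex.base_scores) >= 0.5 else 0 for ex in examples]

print("cost-sensitive savings:", savings(examples, cs_preds))
print("fixed-threshold savings:", savings(examples, naive_preds))
```

In this toy data the fixed threshold misses the first taxpayer, whose low score hides a large potential loss, while the cost-aware rule flags that taxpayer. This is the kind of difference the savings measure is meant to expose.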


Source journal

Business & Information Systems Engineering (Computer Science, Information Systems)

CiteScore: 13.60

Self-citation rate: 7.60%

Review time: 3 months

About the journal: Business & Information Systems Engineering (BISE) is a double-blind peer-reviewed journal with a primary focus on the design and utilization of information systems for social welfare. The journal aims to contribute to the understanding and advancement of information systems in ways that benefit societal well-being.