{"title":"Exposing model bias in machine learning revisiting the boy who cried wolf in the context of phishing detection","authors":"D. Chaojie, Anuj Gaurav","doi":"10.1080/2573234X.2021.1934128","DOIUrl":null,"url":null,"abstract":"ABSTRACT Grown out of the quest for artificial intelligence (AI), machine learning (ML) is today’s most active field across disciplines with a sharp increase in applications ranging from criminology to fraud detection and to biometrics. ML and statistics both emphasise model estimation/training and thus share the inescapable Type 1 and 2 errors. Extending the concept of statistical errors into the domain of ML, we devise a ground-breaking pH scale-like ratio and intend it as a litmus test indicator of ML model bias completely masked by the popular performance criterion of accuracy. Using publicly available phishing dataset, we conduct experiments on a series of classification models and consequently unravel the significant cost implications of models with varying levels of bias. Based on these results, we recommend practitioners exercise human judgement and match their own risk tolerance profile with the bias ratio associated with each ML model in order to guard against potential unintended adverse effects.","PeriodicalId":36417,"journal":{"name":"Journal of Business Analytics","volume":null,"pages":null},"PeriodicalIF":1.7000,"publicationDate":"2021-06-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"2","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Journal of Business Analytics","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1080/2573234X.2021.1934128","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q3","JCRName":"COMPUTER SCIENCE, INFORMATION SYSTEMS","Score":null,"Total":0}
Citations: 2
Abstract
Grown out of the quest for artificial intelligence (AI), machine learning (ML) is among today’s most active fields across disciplines, with a sharp increase in applications ranging from criminology to fraud detection to biometrics. ML and statistics both emphasise model estimation/training and thus share the inescapable Type I and Type II errors. Extending the concept of statistical errors into the domain of ML, we devise a ground-breaking pH-scale-like ratio and intend it as a litmus-test indicator of ML model bias that the popular performance criterion of accuracy completely masks. Using a publicly available phishing dataset, we conduct experiments on a series of classification models and unravel the significant cost implications of models with varying levels of bias. Based on these results, we recommend that practitioners exercise human judgement and match their own risk tolerance profile with the bias ratio associated with each ML model, in order to guard against potential unintended adverse effects.
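The abstract does not reproduce the ratio's formula. As a minimal illustrative sketch, assuming the bias indicator is a log-scaled comparison of the two error counts (the `bias_ratio` helper below is hypothetical, not the authors' published definition), the following Python snippet shows how two classifiers with identical accuracy can carry very different bias:

```python
from math import log10

def bias_ratio(fp: int, fn: int) -> float:
    """Hypothetical pH-like bias indicator: the log10 ratio of false
    negatives (missed phish, Type II errors) to false positives (false
    alarms, Type I errors). Zero means the two error types are balanced;
    the sign shows which way the model leans. Add-one smoothing avoids
    log(0). This is an assumed form, not the paper's exact formula."""
    return log10((fn + 1) / (fp + 1))

def accuracy(tp: int, tn: int, fp: int, fn: int) -> float:
    """Overall fraction of correct predictions."""
    return (tp + tn) / (tp + tn + fp + fn)

# Two classifiers with identical accuracy but opposite error profiles:
# accuracy alone cannot tell them apart, while the ratio can.
models = {
    "misses_phish": dict(tp=850, tn=930, fp=50, fn=150),  # Type II heavy
    "cries_wolf":   dict(tp=970, tn=810, fp=170, fn=30),  # Type I heavy
}
for name, m in models.items():
    print(f"{name}: accuracy={accuracy(**m):.3f}, "
          f"bias_ratio={bias_ratio(m['fp'], m['fn']):+.2f}")
```

Both toy models score 0.890 accuracy, yet their bias ratios land on opposite sides of zero (+0.47 vs. -0.74), echoing the pH analogy: a balanced model sits at the neutral point, and deviations in either direction flag which error type, missed phish or false alarms, the model trades away.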