Machine Learning Models to Identify the Risk of Modern Slavery in Brazilian Cities
Marlu da Silva Santos, M. Ladeira, G. V. Erven, Gladston Luiz da Silva
2019 18th IEEE International Conference On Machine Learning And Applications (ICMLA)
Published: 2019-12-01
DOI: https://doi.org/10.1109/ICMLA.2019.00132
Citations: 6
Abstract
The scope of modern slavery encompasses human trafficking, forced labor, debt bondage and child labor. This article proposes the use of predictive models to identify the risk of modern slavery in Brazilian cities using real socioeconomic, demographic and rescue operations data. The study uses an embedded feature-selection technique with Lasso (L1) regularization to select variables. A comparative analysis of techniques for handling imbalanced data was carried out, and the results indicated Random Over-Sampling (ROS) as the best one. In total, 16 models are evaluated, combining 8 different data sets with two classifiers: Logistic Regression (LR) and Gradient Boosting Machine (GBM). The results indicate that the GBM model has better performance and efficiency, with an accuracy of 77%, an AUC of 80% and a G-mean of 71%.
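Two of the ideas named in the abstract, Random Over-Sampling and the G-mean metric, can be illustrated with a minimal sketch. This is not the authors' code: the function names `random_over_sample` and `g_mean` and the toy data are hypothetical, and the sketch assumes the usual definitions (ROS duplicates randomly chosen minority-class rows until the classes are balanced; G-mean is the geometric mean of sensitivity and specificity).

```python
import math
import random
from collections import Counter


def random_over_sample(X, y, seed=0):
    """Duplicate randomly chosen minority-class rows until every class
    has as many rows as the largest class."""
    rng = random.Random(seed)
    counts = Counter(y)
    target = max(counts.values())
    X_out, y_out = list(X), list(y)
    for label, n in counts.items():
        # Rows of the original data belonging to this class.
        pool = [row for row, lab in zip(X, y) if lab == label]
        for _ in range(target - n):
            X_out.append(rng.choice(pool))
            y_out.append(label)
    return X_out, y_out


def g_mean(y_true, y_pred, positive=1):
    """Geometric mean of sensitivity (recall on the positive class)
    and specificity (recall on the negative class)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0
    return math.sqrt(sensitivity * specificity)


# Toy data (hypothetical, not from the paper): 6 negative rows, 2 positive rows.
X = [[0.1], [0.2], [0.3], [0.4], [0.5], [0.6], [0.9], [1.0]]
y = [0, 0, 0, 0, 0, 0, 1, 1]
X_bal, y_bal = random_over_sample(X, y)
balanced_counts = Counter(y_bal)  # both classes now have 6 rows

# Hypothetical predictions: sensitivity 1/2, specificity 3/4.
y_true = [1, 1, 0, 0, 0, 0]
y_pred = [1, 0, 0, 0, 0, 1]
score = g_mean(y_true, y_pred)
```

G-mean is often preferred over plain accuracy on imbalanced data because a classifier that ignores the minority class scores a sensitivity of zero, and therefore a G-mean of zero, however high its accuracy. In practice, over-sampling is normally applied to the training split only, so that duplicated rows do not leak into the evaluation data.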