An efficient over sampled approach for handling imbalanced data using diversified distribution
G. Shobana, B. Battula
2017 International Conference on Inventive Computing and Informatics (ICICI), 2017-11-01
DOI: 10.1109/ICICI.2017.8365232 (https://doi.org/10.1109/ICICI.2017.8365232)
Citation count: 0
Abstract
Data mining is the process of discovering previously unknown relations in databases. Within data mining, classification is the branch of learning that deals with labeled instances. Existing classification algorithms are not effective on imbalanced datasets. In this paper, we propose a novel algorithm, Over Sampling using Diversified Distribution (OSDD), to address the problem of class imbalance learning. The OSDD algorithm identifies unique diversified distributions for efficient oversampling. Experimental results suggest that the proposed approach outperforms the compared approach in terms of AUC, precision, recall, and F-measure.
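
The abstract does not spell out the steps of OSDD, so the sketch below is not the authors' algorithm. It only illustrates the general idea the abstract builds on: plain random oversampling of the minority class, followed by the precision, recall, and F-measure computation used for evaluation. The function names `random_oversample` and `precision_recall_f1` and the toy data are assumptions made for illustration.

```python
import random
from collections import Counter


def random_oversample(X, y, seed=0):
    """Baseline random oversampling (NOT OSDD): duplicate randomly chosen
    rows of each smaller class until it matches the largest class."""
    rng = random.Random(seed)
    counts = Counter(y)
    target = max(counts.values())
    X_out, y_out = list(X), list(y)
    for label, count in counts.items():
        # Rows of the original data that belong to this class.
        pool = [x for x, lbl in zip(X, y) if lbl == label]
        for _ in range(target - count):
            X_out.append(rng.choice(pool))
            y_out.append(label)
    return X_out, y_out


def precision_recall_f1(y_true, y_pred, positive=1):
    """Precision, recall, and F-measure for the class labeled `positive`."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1


# Hypothetical toy data: four majority rows (label 0), one minority row (label 1).
X = [[0.1], [0.2], [0.3], [0.4], [0.9]]
y = [0, 0, 0, 0, 1]
X_bal, y_bal = random_oversample(X, y)
print(Counter(y_bal))
print(precision_recall_f1([1, 1, 0, 0], [1, 0, 1, 0]))
```

Random duplication is the simplest oversampling baseline; a method such as OSDD would presumably replace the uniform `rng.choice` step with a selection guided by the identified distributions, but that detail is not given in the abstract.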