Diagnosis of Obesity Level based on Bagging Ensemble Classifier and Feature Selection Methods

International Journal of Artificial Intelligence & Applications | Pub Date: 2022-03-31 | DOI: 10.5121/ijaia.2022.13203

A. Alzayed, Waheeda Almayyan, A. Al-Hunaiyyan


Citations: 1

Abstract

The amount of data generated by devices and business transactions is rising exponentially, and current machine learning techniques are not feasible for handling such massive volumes. Two schemes are commonly adopted to address this: scaling up the data mining algorithms, and data reduction. Scaling up the algorithms is not the best approach, but data reduction is feasible. A dataset can be reduced in two ways: by selecting an optimal subset of features from the initial dataset, or by eliminating the features that contribute the least information. Overweight and obesity are increasing worldwide, and forecasting future overweight or obesity could support intervention. Our primary objective is to find the optimal subset of features for diagnosing obesity. This article proposes adapting a bagging algorithm, combined with filter-based feature selection, to improve the accuracy of obesity prediction using a minimal feature subset. We used several machine learning algorithms to classify the obesity classes and several filter feature selection methods to maximize classifier accuracy. The experimental results show that the Pairwise Consistency and Pairwise Correlation techniques are promising tools for feature selection with respect to both the quality of the obtained feature subset and computational efficiency. Analysis of the results obtained from the original and modified datasets shows improved classification accuracy and establishes a relationship between obesity/overweight and common risk factors such as weight, age, and physical activity patterns.
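The combination the abstract describes, a filter-based feature selection step followed by a bagging ensemble, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the data are synthetic and binary (the paper classifies several obesity classes), the filter is absolute Pearson correlation with the label (a simple stand-in, not the Pairwise Consistency or Pairwise Correlation methods the authors evaluate), and the base learner is a decision stump.

```python
import random
from collections import Counter
from statistics import mean


def pearson_abs(xs, ys):
    """Absolute Pearson correlation between two numeric sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return abs(cov) / ((vx * vy) ** 0.5) if vx > 0 and vy > 0 else 0.0


def select_features(X, y, k):
    """Filter step (stand-in): keep the k features most correlated with y."""
    scores = [pearson_abs([row[j] for row in X], y) for j in range(len(X[0]))]
    return sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)[:k]


def fit_stump(X, y):
    """Base learner: the single-feature threshold split with fewest errors."""
    best = None
    for j in range(len(X[0])):
        for t in sorted({row[j] for row in X}):
            left = [lab for row, lab in zip(X, y) if row[j] <= t]
            right = [lab for row, lab in zip(X, y) if row[j] > t]
            if not left or not right:
                continue
            l_lab = Counter(left).most_common(1)[0][0]
            r_lab = Counter(right).most_common(1)[0][0]
            errors = (sum(lab != l_lab for lab in left)
                      + sum(lab != r_lab for lab in right))
            if best is None or errors < best[0]:
                best = (errors, j, t, l_lab, r_lab)
    if best is None:  # no valid split in this sample: always predict majority
        majority = Counter(y).most_common(1)[0][0]
        return (0, float("inf"), majority, majority)
    return best[1:]


def fit_bagging(X, y, n_estimators, rng):
    """Train each stump on a bootstrap sample (drawn with replacement)."""
    n = len(X)
    models = []
    for _ in range(n_estimators):
        idx = [rng.randrange(n) for _ in range(n)]
        models.append(fit_stump([X[i] for i in idx], [y[i] for i in idx]))
    return models


def predict(models, row):
    """Aggregate the stumps' predictions by majority vote."""
    votes = Counter(l if row[j] <= t else r for j, t, l, r in models)
    return votes.most_common(1)[0][0]


# Synthetic data (assumption): column 1 shifts with the class, the rest is noise.
rng = random.Random(0)
X, y = [], []
for _ in range(200):
    label = rng.randint(0, 1)
    informative = 2.0 * label + rng.gauss(0, 0.5)
    noise = [rng.gauss(0, 1) for _ in range(4)]
    X.append([noise[0], informative, noise[1], noise[2], noise[3]])
    y.append(label)

X_train, y_train, X_test, y_test = X[:150], y[:150], X[150:], y[150:]

# Select features on the training split only, then project both splits.
selected = select_features(X_train, y_train, k=2)
project = lambda rows: [[row[j] for j in selected] for row in rows]

models = fit_bagging(project(X_train), y_train, n_estimators=15, rng=rng)
predictions = [predict(models, row) for row in project(X_test)]
accuracy = mean(int(p == t) for p, t in zip(predictions, y_test))
print("selected feature indices:", selected)
print("test accuracy:", accuracy)
```

The filter ranks features without training any classifier, which is what makes filter methods cheap compared with wrapper methods; the ensemble then sees only the reduced feature set. Selecting features on the training split alone keeps the held-out accuracy an honest estimate.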


