Title: A novel ensemble approach for heterogeneous data with active learning
Authors: M. Salama, Hatem M. Abdelkader, A. Abdelwahab
Journal: International Journal of Engineering Business Management (impact factor 4.9, JCR Q1, Business)
Published: 2022-01-01 (journal article)
DOI: 10.1177/18479790221082605 (https://doi.org/10.1177/18479790221082605)
Citations: 7
Abstract
Millions of internet users now generate huge volumes of data. This data is extremely heterogeneous, which makes it hard to analyze, even though decision-makers regard it as an indispensable source of information. Because of this rapid growth, data classification and analysis have become an important research subject, and extracting information from such data has become a necessity: these volumes must be processed to uncover hidden information, which improves data analysis and, in turn, classification accuracy. In this paper, we first develop an ensemble machine-learning model based on active learning that identifies the most effective feature extraction strategy for heterogeneous data analysis, and compare it with traditional machine-learning algorithms. Second, we evaluate the proposed model experimentally on five heterogeneous datasets from various domains: a Health Care Reform dataset, Sander Frandsen dataset, Financial Phrase Bank dataset, SMS Spam Collection dataset, and Textbook sales dataset. According to the results, the proposed approach performed better than conventional methods. Finally, the findings confirmed the validity of the proposed technique and met the study's goal: using ensemble methods with active learning to raise the model's overall accuracy in classifying and analyzing heterogeneous data, reduce the time and money spent training the model, and deliver superior analysis performance as well as insights into other aspects of extracting information from heterogeneous data.
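The abstract does not say which ensemble or which query strategy the authors use, so the following is only a minimal sketch of one common way to combine the two ideas: query-by-committee, where a bootstrap ensemble votes on each unlabeled text and the text with the highest vote entropy is sent for labeling. Every name here (`NaiveBayes`, `train_committee`, `active_learn`), the bag-of-words features, and the toy SMS-style messages are assumptions made for illustration, not the paper's method or data.

```python
import math
import random
from collections import Counter


class NaiveBayes:
    """Tiny multinomial naive Bayes over whitespace tokens (Laplace smoothing)."""

    def fit(self, texts, labels):
        self.classes = sorted(set(labels))
        self.prior = Counter(labels)
        self.counts = {c: Counter() for c in self.classes}
        for text, label in zip(texts, labels):
            self.counts[label].update(text.lower().split())
        self.vocab = {w for c in self.classes for w in self.counts[c]}
        return self

    def predict(self, text):
        total = sum(self.prior.values())
        best, best_score = None, -math.inf
        for c in self.classes:
            # +1 in the denominator reserves mass for words never seen in training.
            denom = sum(self.counts[c].values()) + len(self.vocab) + 1
            score = math.log(self.prior[c] / total)
            for w in text.lower().split():
                score += math.log((self.counts[c][w] + 1) / denom)
            if score > best_score:
                best, best_score = c, score
        return best


def train_committee(texts, labels, size, rng):
    """Train `size` models, each on a bootstrap resample of the labeled data."""
    committee = []
    for _ in range(size):
        idx = [rng.randrange(len(texts)) for _ in texts]
        model = NaiveBayes().fit([texts[i] for i in idx], [labels[i] for i in idx])
        committee.append(model)
    return committee


def vote_entropy(committee, text):
    """Disagreement of the committee on one text; 0 means a unanimous vote."""
    votes = Counter(model.predict(text) for model in committee)
    n = sum(votes.values())
    return -sum((v / n) * math.log(v / n) for v in votes.values())


def majority_label(committee, text):
    """Ensemble prediction: the label most committee members voted for."""
    return Counter(m.predict(text) for m in committee).most_common(1)[0][0]


def active_learn(pool, oracle, seed_idx, rounds, committee_size=5, seed=0):
    """Query-by-committee loop: label the most contested text each round."""
    rng = random.Random(seed)
    labeled = {i: oracle(pool[i]) for i in seed_idx}
    for _ in range(rounds):
        unlabeled = [i for i in range(len(pool)) if i not in labeled]
        if not unlabeled:
            break
        committee = train_committee(
            [pool[i] for i in labeled], list(labeled.values()), committee_size, rng
        )
        query = max(unlabeled, key=lambda i: vote_entropy(committee, pool[i]))
        labeled[query] = oracle(pool[query])  # ask the (simulated) annotator
    committee = train_committee(
        [pool[i] for i in labeled], list(labeled.values()), committee_size, rng
    )
    return committee, labeled


# Hypothetical toy pool; the true labels stand in for a human annotator.
POOL = [
    ("win a free prize now", "spam"),
    ("free cash claim your prize", "spam"),
    ("are we meeting for lunch today", "ham"),
    ("call me when you get home", "ham"),
    ("claim free tickets now", "spam"),
    ("see you at lunch tomorrow", "ham"),
    ("urgent prize waiting claim now", "spam"),
    ("can you call me later today", "ham"),
]
TEXTS = [text for text, _ in POOL]
TRUTH = dict(POOL)

if __name__ == "__main__":
    committee, labeled = active_learn(TEXTS, TRUTH.__getitem__, [0, 2], rounds=3)
    print(len(labeled), "labels used;", majority_label(committee, "free prize now"))
```

The point of the loop is the cost argument in the abstract: only the texts the ensemble disagrees on are sent to the annotator, so fewer labels are spent than when labeling the whole pool.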
About the journal:
The International Journal of Engineering Business Management (IJEBM) is an international, peer-reviewed, open access scientific journal that aims to promote an integrated and multidisciplinary approach to engineering, business and management. The journal focuses on issues related to the design, development and implementation of new methodologies and technologies that contribute to strategic and operational improvements of organizations within the contemporary global business environment. IJEBM encourages a systematic and holistic view in order to ensure an integrated and economically, socially and environmentally friendly approach to management of new technologies in business. It aims to be a world-class research platform for academics, managers, and professionals to publish scholarly research in the global arena. All submitted articles considered suitable for the International Journal of Engineering Business Management are subjected to rigorous peer review to ensure the highest levels of quality. The review process is carried out as quickly as possible to minimize any delays in the online publication of articles.
Topics of interest include, but are not limited to:
- Competitive product design and innovation
- Operations and manufacturing strategy
- Knowledge management and knowledge innovation
- Information and decision support systems
- Radio Frequency Identification
- Wireless Sensor Networks
- Industrial engineering for business improvement
- Logistics engineering and transportation
- Modeling and simulation of industrial and business systems
- Quality management and Six Sigma
- Automation of industrial processes and systems
- Manufacturing performance and productivity measurement
- Supply Chain Management and the virtual enterprise network
- Environmental, legal and social aspects
- Technology Capital and Financial Modelling
- Engineering Economics and Investment Theory
- Behavioural, Social and Political factors in Engineering