Comparison and Analysis of Machine Learning Models to Predict Hotel Booking Cancellation
Authors: Yiying Chen, Chuhan Ding, Hanjie Ye, Yuchen Zhou
Published in: Proceedings of the 2022 7th International Conference on Financial Innovation and Economic Development (ICFIED 2022)
DOI: 10.2991/aebmr.k.220307.225 (https://doi.org/10.2991/aebmr.k.220307.225)
Citations: 2
Abstract
Predicting hotel booking cancellations is crucial to revenue and resource management for hotels. This paper evaluates three possible substitutes for a neural network: logistic regression, k-Nearest Neighbor (k-NN), and CatBoost, of which CatBoost is the most suitable model for hotels to use for this prediction. The advantages of these models are effectiveness, high accuracy, and lower cost. The dataset used in this paper was adapted from Kaggle and consists of booking data from two types of hotels in Portugal (a resort hotel and a city hotel), together with the corresponding customers' information. We select some key variables as predictors to train and test prediction models based on the three machine learning algorithms. After preprocessing the raw data (standardizing, dealing with missing data, recoding some variables, and scaling), we run the predictions and compare the models using three metrics: the confusion matrix, the accuracy score, and the F1-score. The results indicate that CatBoost performs best at predicting hotel booking cancellations because it has the greatest number of correctly predicted samples and the highest accuracy score. We focus on the efficiency and economy of cancellation prediction in the hospitality industry to form a basis for future revenue and resource management for hotels.
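The three comparison metrics named in the abstract can be illustrated with a minimal sketch. This is not the authors' code: the function names and the label lists below are hypothetical, and the labels are made up purely to show the arithmetic (1 = canceled, 0 = not canceled).

```python
from collections import Counter


def confusion_matrix(y_true, y_pred):
    """Return (tn, fp, fn, tp) for binary labels, where 1 means 'canceled'."""
    counts = Counter(zip(y_true, y_pred))  # keys are (true, predicted) pairs
    return counts[(0, 0)], counts[(0, 1)], counts[(1, 0)], counts[(1, 1)]


def accuracy(y_true, y_pred):
    """Share of all samples that were predicted correctly."""
    tn, fp, fn, tp = confusion_matrix(y_true, y_pred)
    return (tp + tn) / (tn + fp + fn + tp)


def f1_score(y_true, y_pred):
    """Harmonic mean of precision and recall for the positive class."""
    tn, fp, fn, tp = confusion_matrix(y_true, y_pred)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Hypothetical labels, not taken from the paper's dataset.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

print(confusion_matrix(y_true, y_pred))  # (3, 1, 1, 3)
print(accuracy(y_true, y_pred))          # 0.75
print(f1_score(y_true, y_pred))          # 0.75
```

In a comparison like the one described, each model's predictions on the same test set would be passed through these functions; the number of correctly predicted samples is `tn + tp` from the confusion matrix.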