Accurate Occupancy Detection of an Office Room From Light, Temperature, Humidity and CO2 Measurements Using Statistical Learning Models
Author: Alex Mirugwe
Journal: Econometrics: Data Collection & Data Estimation Methodology eJournal
Published: 2020-09-04
DOI: https://doi.org/10.2139/ssrn.3686755
Citations: 85
Abstract
This project develops, validates, and tests several statistical classification models that predict whether an office room is occupied from five features: temperature (°C), light (lx), humidity (%), CO2 (ppm), and humidity ratio. The data are modeled using five classification techniques: logistic regression, a classification tree, bagging, random forest, and gradient boosted trees.
The models were trained and then evaluated against validation and test sets, using confusion matrices to obtain classification and misclassification rates. The logistic regression model was trained with the glmnet R package, the classification tree with the tree package, the bagging and random forest models with the randomForest package, and the gradient boosted model with the gbm package.
The random forest model achieved the best accuracy, with a classification rate of 93.21% on the test set. In all five models, the light sensor reading was the most significant variable for predicting whether the office room is occupied.
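The evaluation step described in the abstract, a confusion matrix from which classification and misclassification rates are read off, can be illustrated with a minimal sketch. This is not the author's code: the paper used R packages, whereas the sketch below is plain Python, and the function names and the label vectors are hypothetical, chosen only to show the arithmetic.

```python
from collections import Counter


def confusion_matrix(actual, predicted):
    """Count outcome pairs for binary labels (0 = not occupied, 1 = occupied)."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    counts = Counter(zip(actual, predicted))
    return {
        "tn": counts[(0, 0)],  # empty room predicted empty
        "fp": counts[(0, 1)],  # empty room predicted occupied
        "fn": counts[(1, 0)],  # occupied room predicted empty
        "tp": counts[(1, 1)],  # occupied room predicted occupied
    }


def classification_rate(cm):
    """Fraction of observations on the diagonal of the confusion matrix."""
    total = sum(cm.values())
    if total == 0:
        raise ValueError("confusion matrix is empty")
    return (cm["tp"] + cm["tn"]) / total


def misclassification_rate(cm):
    """Complement of the classification rate."""
    return 1.0 - classification_rate(cm)


# Hypothetical labels for illustration only; not data from the paper.
actual = [0, 0, 1, 1, 1, 0, 1, 0]
predicted = [0, 1, 1, 1, 0, 0, 1, 0]

cm = confusion_matrix(actual, predicted)
print(cm)
print(f"classification rate:    {classification_rate(cm):.2%}")
print(f"misclassification rate: {misclassification_rate(cm):.2%}")
```

On these made-up labels, six of eight predictions are correct, so the classification rate is 75%. The 93.21% reported for the random forest model is the same quantity computed on the paper's test set.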