Predictive modeling of energy demands for battery electric buses using real-world data
Md Atiqur Rahman, David Holt, Yashar Farajpour, Abdelhamid Mammeri, Hasti Khiabani
Energy Informatics, volume 8, issue 1. Published 2025-09-26. DOI: 10.1186/s42162-025-00564-y
https://link.springer.com/article/10.1186/s42162-025-00564-y
Abstract
The transition to battery electric buses (BEBs) offers a significant opportunity to reduce greenhouse gas (GHG) emissions in public transit. However, the limited driving range of BEBs presents operational challenges, making accurate energy demand prediction essential for effective deployment. Despite advances in machine learning and data-driven modeling, an integrated framework for real-world BEB energy demand prediction remains underdeveloped. Most existing research in this domain relies heavily on simulated or controlled datasets, limiting practical applicability. This study addresses this gap by presenting a comprehensive approach to predicting the energy demands of a BEB fleet under actual service conditions, grounded in real-world operational data collected from the Toronto Transit Commission’s (TTC) BEB trial, one of the largest of its kind in North America. At the core of this approach is a novel data processing framework specifically designed for streaming high-resolution vehicle telematics data, which integrates diverse contextual sources such as weather conditions, route topology, passenger loads, and bus schedules. This integrated framework enables the construction of a large-scale BEB dataset derived from in-service operational data of the TTC’s BEB fleet, encompassing 149,813 hours of driving and 2.56 million kilometers traveled. The dataset is leveraged to train and evaluate several machine learning models to predict energy demands along TTC routes. Results demonstrate that the best-performing model achieves a 38% reduction in mean absolute error compared to a baseline method and explains 87% of the variance in net energy demand. Additionally, an analysis of seasonal effects reveals heightened prediction challenges during colder months, driven by increased variability in energy consumption across different BEB makes and models. 
Finally, a physics-informed hybrid modeling approach is proposed, which integrates energy estimates from vehicle longitudinal dynamics into the data-driven pipeline, yielding further improvements in prediction accuracy and underscoring the value of domain knowledge in machine learning applications for transit.
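
The abstract names two quantitative ingredients without giving formulas: an energy estimate from vehicle longitudinal dynamics, and accuracy reported as a relative reduction in mean absolute error and as explained variance. The Python sketch below illustrates both using textbook definitions. It is not the authors' implementation: the function names and every parameter value (bus mass, rolling-resistance coefficient, drag area, drivetrain and regeneration efficiencies) are assumptions chosen for illustration only.

```python
import math

G = 9.81  # gravitational acceleration, m/s^2


def longitudinal_energy_kwh(speeds, grades, dt=1.0, mass=15000.0, c_rr=0.008,
                            rho=1.2, cd_a=5.0, drivetrain_eff=0.9, regen_eff=0.6):
    """Estimate net traction energy (kWh) from a sampled speed trace.

    speeds: vehicle speed samples in m/s, spaced dt seconds apart.
    grades: road slope angle in radians, one value per speed sample.
    All default parameter values are illustrative assumptions.
    """
    energy_j = 0.0
    for i in range(1, len(speeds)):
        v = 0.5 * (speeds[i] + speeds[i - 1])   # mean speed over the step
        a = (speeds[i] - speeds[i - 1]) / dt    # acceleration over the step
        theta = grades[i]
        force = (mass * a                               # inertial force
                 + mass * G * c_rr * math.cos(theta)    # rolling resistance
                 + mass * G * math.sin(theta)           # grade force
                 + 0.5 * rho * cd_a * v * v)            # aerodynamic drag
        power = force * v
        if power >= 0:
            # Traction: the battery supplies more than the wheel power.
            energy_j += power * dt / drivetrain_eff
        else:
            # Braking: only part of the wheel energy is recovered.
            energy_j += power * dt * regen_eff
    return energy_j / 3.6e6  # joules to kWh


def mean_absolute_error(y_true, y_pred):
    """Mean absolute error between observed and predicted values."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def relative_mae_reduction(y_true, y_model, y_baseline):
    """Fractional MAE reduction of a model relative to a baseline."""
    return 1.0 - (mean_absolute_error(y_true, y_model)
                  / mean_absolute_error(y_true, y_baseline))


def r_squared(y_true, y_pred):
    """Coefficient of determination (fraction of variance explained)."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot
```

In a hybrid setup of the kind the abstract describes, a value such as the output of `longitudinal_energy_kwh` could be supplied to a data-driven model as one more input feature alongside weather, passenger load, and schedule data. How the paper actually combines the physics-based estimate with the learned model is not stated in the abstract.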