Title: Prediction of waterborne freight activity with Automatic identification System using Machine learning
Authors: Sanjeev Bhurtyal, Hieu Bui, Sarah Hernandez, Sandra Eksioglu, Magdalena Asborno, Kenneth N. Mitchell, Marin Kress
Journal: Computers & Industrial Engineering, volume 200, Article 110757
Publication date: 2025-02-01
Publication type: Journal Article
DOI: 10.1016/j.cie.2024.110757
URL: https://www.sciencedirect.com/science/article/pii/S0360835224008799
Impact factor: 6.7
JCR: Q1 (Computer Science, Interdisciplinary Applications)
Region category: Engineering Technology (region 1)
Open access: no
Citations: 0
Abstract
This paper addresses latency in publicly available port-level commodity tonnage reports. Currently, commodity throughput is derived from Waterborne Commerce Statistics (WCS) data, which are released publicly approximately two years after collection. This latency presents a challenge for short-term planning and other operational uses. To reduce it, this study combines near-real-time vessel tracking data from the Automatic Identification System (AIS) with historical WCS data in a machine learning model that predicts commodity tonnage at the port level. Long Short-Term Memory (LSTM), Temporal Convolutional Network (TCN), and Temporal Fusion Transformer (TFT) models are developed using features extracted from AIS and the historical WCS data. Each model outputs the predicted quarterly volume of commodities (in tons) at port terminals for four quarters into the future. Two types of models are developed: (i) uncategorized, a single model trained on all port terminals; and (ii) categorized, four models, one per dominant vessel type at the port terminal (cargo, tanker, tug/tow, and mixed). The uncategorized model outperformed the categorized model based on the Mean Absolute Percentage Error (MAPE), and the uncategorized LSTM model has the highest accuracy among all model types. Results show that the model is more accurate for port terminals that handle a single vessel type than for those that handle more than one. Six of seven commodity groups have a MAPE of less than 30% under the uncategorized LSTM framework. Applying the model enables port authorities and stakeholders to make short-term capacity expansion and infrastructure investment decisions based on commodity volume.
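The abstract compares models by MAPE but does not spell out the computation. As a minimal sketch, assuming the standard definition MAPE = (100 / n) * sum(|actual - predicted| / |actual|), the metric for one terminal's four-quarter forecast could be computed as below. The function name `mape` and the tonnage figures are hypothetical illustrations, not values or code from the paper.

```python
def mape(actual, predicted):
    """Mean Absolute Percentage Error, returned in percent.

    Assumes the standard definition; the paper's exact variant
    (e.g., how zero-tonnage quarters are handled) is not stated.
    """
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    if len(actual) == 0:
        raise ValueError("at least one observation is required")
    if any(a == 0 for a in actual):
        raise ValueError("MAPE is undefined when an actual value is zero")
    total = sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)


# Hypothetical quarterly tonnage (tons) for one terminal over four quarters.
actual_tons = [100_000, 120_000, 80_000, 200_000]
predicted_tons = [90_000, 132_000, 80_000, 150_000]

score = mape(actual_tons, predicted_tons)
print(f"MAPE: {score:.2f}%")
```

Because each error is divided by the actual tonnage, quarters with very small volumes can dominate the average, which is one reason to reject zero-valued actuals explicitly rather than silently skipping them.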
About the journal:
Computers & Industrial Engineering (CAIE) is dedicated to researchers, educators, and practitioners in industrial engineering and related fields. Pioneering the integration of computers in research, education, and practice, industrial engineering has evolved to make computers and electronic communication integral to its domain. CAIE publishes original contributions focusing on the development of novel computerized methodologies to address industrial engineering problems. It also highlights the applications of these methodologies to issues within the broader industrial engineering and associated communities. The journal actively encourages submissions that push the boundaries of fundamental theories and concepts in industrial engineering techniques.