A Heuristic ETL Process to Dynamically Separate and Compress AIS Data
Authors: Atefe Sedaghat, M. Kang, Maryam Hamidi
Published in: 2023 Systems and Information Engineering Design Symposium (SIEDS), 2023-04-27
DOI: 10.1109/SIEDS58326.2023.10137847 (https://doi.org/10.1109/SIEDS58326.2023.10137847)
Citation count: 0
Abstract
Marine Automatic Identification Systems (AIS) produce massive volumes of vessel trajectory data from which information about water traffic can be extracted. Collecting and processing data at this scale efficiently requires specialized methods. This study designs a new system for collecting and processing AIS data in real time. The proposed system compresses vessel data while preserving useful information, and it also enriches the raw trajectory data with additional attributes: trip ID, trip origin and destination, traffic density, and traffic flow. First, the study presents a dynamic Extract, Transform, and Load (ETL) pipeline that collects AIS messages from vessels, processes the raw data, and loads the processed data into a central database. An optimized algorithm is developed that can process millions of records quickly and send the processed data to production. Next, a user interface is developed to quantify traffic conditions and visualize them in graphs and maps. Finally, the Gulf Intracoastal Waterway (GIWW) is used as the study area: historical and real-time AIS data from the GIWW were collected to test the functionality of the method.
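The abstract does not describe the heuristic itself, so the following is only a minimal sketch of what separating AIS records into trips and compressing them could look like. It assumes a new trip starts when a vessel has been silent for longer than a fixed gap, and that a point can be dropped when the vessel has barely moved since the last kept point. The names `AisPoint`, `split_into_trips`, `compress_trip`, `max_gap_s`, and `min_move_deg` are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass
from itertools import groupby
from math import hypot


@dataclass(frozen=True)
class AisPoint:
    mmsi: int    # vessel identifier
    ts: int      # Unix timestamp in seconds
    lat: float   # latitude in degrees
    lon: float   # longitude in degrees


def split_into_trips(points, max_gap_s=1800):
    """Assign a trip id per vessel; a silence longer than max_gap_s starts a new trip.

    The time-gap rule is an assumption made for illustration only.
    """
    trips = {}
    trip_id = 0
    ordered = sorted(points, key=lambda p: (p.mmsi, p.ts))
    for _, vessel_points in groupby(ordered, key=lambda p: p.mmsi):
        previous = None
        for p in vessel_points:
            if previous is None or p.ts - previous.ts > max_gap_s:
                trip_id += 1
                trips[trip_id] = []
            trips[trip_id].append(p)
            previous = p
    return trips


def compress_trip(trip, min_move_deg=0.001):
    """Keep the first and last points, plus any point that moved at least min_move_deg.

    Distance is measured in raw degrees, which is a simplification.
    """
    if len(trip) <= 2:
        return list(trip)
    kept = [trip[0]]
    for p in trip[1:-1]:
        last = kept[-1]
        if hypot(p.lat - last.lat, p.lon - last.lon) >= min_move_deg:
            kept.append(p)
    kept.append(trip[-1])
    return kept
```

Under these assumptions, the first and last points of each compressed trip could serve as its origin and destination, since `compress_trip` always retains both.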