Data Lake Architecture for Air Traffic Management
R. Raju, R. Mital, Daniel M. Finkelsztein
2018 IEEE/AIAA 37th Digital Avionics Systems Conference (DASC), published 2018-09-01
DOI: https://doi.org/10.1109/dasc.2018.8569361
Citations: 6
Abstract
The air traffic transformation underway in the US with FAA NextGen and in Europe with SESAR relies on information sharing and system interoperability to increase efficiency, safety, and capacity. The proliferation and dissemination of flight, weather, aeronautical, and environmental data by all air traffic participants represents a treasure trove of air traffic optimization opportunities waiting to be exploited. Traditional data exploitation methods and tools tend to rely on structured data stores and analytical capabilities architected to answer defined, current questions. SGT, in collaboration with the US DOT Volpe National Transportation Systems Center, developed a prototype cloud-based air transportation Data Lake to harness big data from a variety of sources and build the current and next generation of analytics capabilities. The Data Lake prototype ingests data from multiple sources, including FAA sources such as SFDPS, TFMData, TBFM, STDDS, ITWS, and AEDT, and stores it in raw, processed, and refined formats. The prototype illustrates how users can perform powerful air traffic data analysis on structured, semi-structured, and unstructured data, using open-source tools to execute queries, run searches, process streams, and visualize data. Using a combination of traditional SQL and NoSQL, open-source and COTS products (PostgreSQL, Elastic-Logstash-Kibana, Apache Kafka, Apache Spark, and visualization tools such as Tableau, D3, and others), the project shows how analysts can quickly and easily build powerful data pipelines and statistical models.
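
The raw, processed, and refined storage tiers mentioned above can be pictured with a minimal Python sketch. It is an illustration only, not the authors' implementation: the message fields (`flight_id`, `origin`, `delay_min`), the sample records, and the function names are all hypothetical, and in-memory lists stand in for the PostgreSQL, Kafka, and Spark components named in the abstract.

```python
import json
from collections import defaultdict
from statistics import mean

# Hypothetical semi-structured messages; the field names and values are
# illustrative and do not reflect any real FAA feed schema.
RAW_MESSAGES = [
    '{"flight_id": "ABC123", "origin": "KBOS", "delay_min": "12"}',
    '{"flight_id": "DEF456", "origin": "KBOS", "delay_min": "30"}',
    '{"flight_id": "GHI789", "origin": "KJFK", "delay_min": "5"}',
    'not valid json',
]


def ingest_raw(messages):
    """Raw zone: keep every message exactly as received, including bad ones."""
    return list(messages)


def to_processed(raw_zone):
    """Processed zone: parse and type each record, skipping unparseable ones."""
    processed = []
    for line in raw_zone:
        try:
            record = json.loads(line)
            processed.append({
                "flight_id": record["flight_id"],
                "origin": record["origin"],
                "delay_min": int(record["delay_min"]),
            })
        except (ValueError, KeyError, TypeError):
            # Malformed input stays in the raw zone only.
            continue
    return processed


def to_refined(processed_zone):
    """Refined zone: aggregate into an analysis-ready summary
    (here, mean delay in minutes per origin airport)."""
    delays = defaultdict(list)
    for record in processed_zone:
        delays[record["origin"]].append(record["delay_min"])
    return {origin: mean(values) for origin, values in delays.items()}


raw = ingest_raw(RAW_MESSAGES)
processed = to_processed(raw)
refined = to_refined(processed)
print(refined)
```

Keeping the raw zone untouched means that new questions can be answered later by reprocessing the original messages, which is the contrast the abstract draws with stores designed only for questions defined in advance.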