Traffic-IT: Enhancing traffic scene understanding for multimodal large language models
Authors: Senyun Kuang, Yang Liu, Xiaobo Qu, Yintao Wei
Journal: Transportation Research Part C-Emerging Technologies, volume 180, Article 105325 (impact factor 7.6; JCR Q1, Transportation Science & Technology)
Publication date: 2025-09-08
DOI: 10.1016/j.trc.2025.105325
URL: https://www.sciencedirect.com/science/article/pii/S0968090X25003298
In recent years, the convergence of artificial intelligence and urban infrastructure has driven transformative advances in intelligent transportation systems (ITS). However, traditional models often lack the generalizability needed to adapt to diverse traffic scenarios. Multimodal large language models (MLLMs) offer a promising solution, yet they are typically trained on general datasets, limiting their effectiveness in specific transportation contexts. To address this, we introduce Traffic-IT, a dataset comprising 220,950 question-and-answer pairs from 30,000 images, designed to enhance MLLMs’ capabilities in traffic scene understanding. The dataset covers various traffic scenarios, including weather conditions, locations, and times of day, providing in-depth insights and driving strategies tailored to real-world needs. Created through expert consultation and rigorous validation, Traffic-IT significantly improves MLLMs’ performance in interpreting complex traffic scenes. We anticipate that Traffic-IT will be a crucial resource for future developments in smart city applications.
About the journal:
Transportation Research: Part C (TR_C) is dedicated to showcasing high-quality, scholarly research that delves into the development, applications, and implications of transportation systems and emerging technologies. Our focus lies not solely on individual technologies, but rather on their broader implications for the planning, design, operation, control, maintenance, and rehabilitation of transportation systems, services, and components. In essence, the intellectual core of the journal revolves around the transportation aspect rather than the technology itself. We actively encourage the integration of quantitative methods from diverse fields such as operations research, control systems, complex networks, computer science, and artificial intelligence. Join us in exploring the intersection of transportation systems and emerging technologies to drive innovation and progress in the field.