A dataset for cyber threat intelligence modeling of connected autonomous vehicles

Authors: Yinghui Wang, Yilong Ren, Hongmao Qin, Zhiyong Cui, Yanan Zhao, Haiyang Yu
Journal: Scientific Data, 12(1), 366. Published 2025-03-01. Journal Article.
DOI: https://doi.org/10.1038/s41597-025-04439-5
PDF (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11873188/pdf/
Impact factor: 6.9 (JCR Q1, Multidisciplinary Sciences)

Cyber attacks pose significant threats to connected autonomous vehicles in intelligent transportation systems. Cyber threat intelligence (CTI), which involves collecting and analyzing cyber threat information, offers a promising approach to addressing emerging vehicle cyber threats and enabling proactive security defenses. Obtaining valuable information from enormous cybersecurity data using knowledge extraction technologies to achieve CTI modeling is an effective means to ensure automotive cybersecurity. However, the lack of a specialized cybersecurity dataset for automotive CTI knowledge mining has hindered progress in this field. To address this gap, we present a novel corpus specifically designed for vehicle cybersecurity knowledge mining. This dataset, annotated using a joint labeling strategy, comprises 908 real automotive cybersecurity reports, 8195 security entities and 4852 semantic relations. In addition, we conduct a comprehensive analysis of CTI knowledge mining algorithms based on this corpus. Our work provides a valuable resource for enhancing CTI modeling and advancing automotive cybersecurity research.
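
To make the reported corpus size concrete, the sketch below derives per-report annotation density from the three counts given in the abstract and shows one way a jointly labeled record (entities plus relations over the same text) could be represented. Only the three counts come from the abstract. The class names, field names, label names, and the sample sentence are assumptions made for illustration and do not reflect the dataset's actual schema or contents.

```python
from dataclasses import dataclass, field

# Corpus-level counts as stated in the abstract.
REPORTS = 908
ENTITIES = 8195
RELATIONS = 4852


@dataclass
class Entity:
    text: str   # surface span in the report
    label: str  # hypothetical entity type, not the dataset's real label set


@dataclass
class Relation:
    head: int   # index of the head entity in AnnotatedReport.entities
    tail: int   # index of the tail entity in AnnotatedReport.entities
    label: str  # hypothetical relation type


@dataclass
class AnnotatedReport:
    text: str
    entities: list[Entity] = field(default_factory=list)
    relations: list[Relation] = field(default_factory=list)


def triples(report: AnnotatedReport) -> list[tuple[str, str, str]]:
    """Flatten a report's relations into (head text, relation, tail text)."""
    return [
        (report.entities[r.head].text, r.label, report.entities[r.tail].text)
        for r in report.relations
    ]


def per_report(total: int, n_reports: int = REPORTS) -> float:
    """Average number of annotations per report, rounded to two decimals."""
    return round(total / n_reports, 2)


# Invented sample record, for illustration only.
sample = AnnotatedReport(
    text="An attacker exploited the telematics unit via a crafted CAN message.",
    entities=[
        Entity("attacker", "THREAT_ACTOR"),
        Entity("telematics unit", "COMPONENT"),
    ],
    relations=[Relation(head=0, tail=1, label="exploits")],
)

print(triples(sample))
print(per_report(ENTITIES), per_report(RELATIONS))
```

With the abstract's figures this works out to roughly 9 entities and 5 relations per report on average. The abstract does not say how annotations are distributed across individual reports.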
About the journal:
Scientific Data is an open-access journal focused on data, publishing descriptions of research datasets and articles on data sharing across natural sciences, medicine, engineering, and social sciences. Its goal is to enhance the sharing and reuse of scientific data, encourage broader data sharing, and acknowledge those who share their data.
The journal primarily publishes Data Descriptors, which offer detailed descriptions of research datasets, including data collection methods and technical analyses validating data quality. These descriptors aim to facilitate data reuse rather than to test hypotheses or present new interpretations, methods, or in-depth analyses.