Knowledge graph construction for intelligent cockpits based on large language models.

IF 3.9 | Zone 2 (Multidisciplinary Journals) | Q1 MULTIDISCIPLINARY SCIENCES

Scientific Reports Pub Date : 2025-03-04 DOI:10.1038/s41598-025-92002-y

Haomin Dong, Wenbin Wang, Zhenjiang Sun, Ziyi Kang, Xiaojun Ge, Fei Gao, Jixin Wang

Scientific Reports 15(1), 7635 (2025). Full text (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11880557/pdf/

Citations: 0

Abstract


Knowledge graph construction for intelligent cockpits based on large language models.

As intelligent cockpits rapidly evolve towards "proactive natural interaction," traditional rule-based user behavior inference methods are facing scalability, generalization, and accuracy bottlenecks, leading to the development and deployment of functions oriented towards pseudo-demands. Effectively capturing and representing the hidden associative knowledge in intelligent cockpits can enhance the system's understanding of user behavior and environmental contexts, thereby precisely discerning real user needs. In this context, knowledge graphs (KGs) have emerged as an effective tool, enabling the retrieval and organization of vast amounts of information within interconnected and interpretable structures. However, rapidly and flexibly generating domain-specific KGs still poses significant challenges. To address this, this paper introduces a novel knowledge graph construction (KGC) model, GLM-TripleGen, dedicated to analyzing the states and behaviors within intelligent cockpits. This model aims to precisely mine the latent relationships between cockpit state factors and behavioral sequences, effectively addressing key challenges such as the ambiguity in entity recognition and the complexity of relationship extraction within cockpit data. To enhance the adaptability of GLM-TripleGen to the intelligent cockpit domain, this paper constructs an instruction-following dataset based on vehicle states and in-cockpit interaction behaviors, containing a large number of prompt texts paired with corresponding triple labels, to support model fine-tuning. During the fine-tuning process, the Low-Rank Adaptation (LoRA) method is employed to effectively optimize model parameters, significantly reducing training costs. Extensive experiments demonstrate that GLM-TripleGen outperforms existing state-of-the-art KGC methods, accurately generating normalized cockpit triple units. 
Furthermore, GLM-TripleGen exhibits exceptional robustness and generalization ability, handling various unknown entities and relationships with minimal generalization processing.
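The abstract describes prompt texts paired with triple labels and LoRA fine-tuning without giving concrete formats. As a minimal sketch only, the following shows how one such sample might be parsed into (head, relation, tail) triples and indexed as a graph, and why a low-rank update reduces the number of trainable parameters. The field names, the label syntax, the example sentence, and the matrix sizes are all assumptions made for illustration and are not taken from the paper.

```python
from collections import defaultdict

# Hypothetical instruction-following sample. The field names and the
# "(head, relation, tail)" label syntax are assumptions, not the paper's format.
sample = {
    "instruction": "Extract (head, relation, tail) triples from the cockpit log.",
    "input": "Cabin temperature is 30 C; the driver turns on the air conditioner.",
    "output": "(cabin temperature, has_value, 30 C); (driver, turns_on, air conditioner)",
}


def parse_triples(label: str) -> list[tuple[str, str, str]]:
    """Parse a ';'-separated string of '(h, r, t)' items into tuples."""
    triples = []
    for item in label.split(";"):
        parts = [p.strip() for p in item.strip().strip("()").split(",")]
        # Keep only well-formed items with exactly three non-empty fields.
        if len(parts) == 3 and all(parts):
            triples.append((parts[0], parts[1], parts[2]))
    return triples


def build_graph(triples):
    """Index triples as head -> list of (relation, tail) edges."""
    graph = defaultdict(list)
    for head, relation, tail in triples:
        graph[head].append((relation, tail))
    return dict(graph)


def lora_trainable_params(d: int, k: int, r: int) -> int:
    """Trainable parameters when the update to a d x k weight matrix is
    factored as B (d x r) times A (r x k), as in LoRA."""
    return r * (d + k)


triples = parse_triples(sample["output"])
graph = build_graph(triples)
print(graph)

# Illustrative sizes only: a 4096 x 4096 layer with rank 8.
print(lora_trainable_params(4096, 4096, 8), "trainable vs", 4096 * 4096, "full")
```

With LoRA the pretrained weight matrix stays frozen and only the two small factors are trained, which is the source of the reduced training cost the abstract mentions; the actual rank and target layers used for GLM-TripleGen are not stated here.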


Source journal

Scientific Reports (Natural Science Disciplines)

CiteScore: 7.50

Self-citation rate: 4.30%

Articles published: 19567

Review time: 3.9 months

About the journal: We publish original research from all areas of the natural sciences, psychology, medicine and engineering. You can learn more about what we publish by browsing our specific scientific subject areas below or explore Scientific Reports by browsing all articles and collections. Scientific Reports has a 2-year impact factor: 4.380 (2021), and is the 6th most-cited journal in the world, with more than 540,000 citations in 2020 (Clarivate Analytics, 2021).

• Engineering: Engineering covers all aspects of engineering, technology, and applied science. It plays a crucial role in the development of technologies to address some of the world's biggest challenges, helping to save lives and improve the way we live.

• Physical sciences: Physical sciences are those academic disciplines that aim to uncover the underlying laws of nature, often written in the language of mathematics. It is a collective term for areas of study including astronomy, chemistry, materials science and physics.

• Earth and environmental sciences: Earth and environmental sciences cover all aspects of Earth and planetary science and broadly encompass solid Earth processes, surface and atmospheric dynamics, Earth system history, climate and climate change, marine and freshwater systems, and ecology. It also considers the interactions between humans and these systems.

• Biological sciences: Biological sciences encompass all the divisions of natural sciences examining various aspects of vital processes. The concept includes anatomy, physiology, cell biology, biochemistry and biophysics, and covers all organisms, from microorganisms and animals to plants.

• Health sciences: The health sciences study health, disease and healthcare. This field of study aims to develop knowledge, interventions and technology for use in healthcare to improve the treatment of patients.