{"title":"Hallucination Reduction and Optimization for Large Language Model-Based Autonomous Driving","authors":"Jue Wang","doi":"10.3390/sym16091196","DOIUrl":null,"url":null,"abstract":"Large language models (LLMs) are widely integrated into autonomous driving systems to enhance their operational intelligence and responsiveness and improve self-driving vehicles’ overall performance. Despite these advances, LLMs still struggle between hallucinations—when models either misinterpret the environment or generate imaginary parts for downstream use cases—and taxing computational overhead that relegates their performance to strictly non-real-time operations. These are essential problems to solve to make autonomous driving as safe and efficient as possible. This work is thus focused on symmetrical trade-offs between the reduction of hallucination and optimization, leading to a framework for these two combined and at least specifically motivated by these limitations. This framework intends to generate a symmetry of mapping between real and virtual worlds. It helps in minimizing hallucinations and optimizing computational resource consumption reasonably. In autonomous driving tasks, we use multimodal LLMs that combine an image-encoding Visual Transformer (ViT) and a decoding GPT-2 with responses generated by the powerful new sequence generator from OpenAI known as GPT4. Our hallucination reduction and optimization framework leverages iterative refinement loops, RLHF—reinforcement learning from human feedback (RLHF)—along with symmetric performance metrics, e.g., BLEU, ROUGE, and CIDEr similarity scores between machine-generated answers specific to other human reference answers. This ensures that improvements in model accuracy are not overused to the detriment of increased computational overhead. Experimental results show a twofold improvement in decision-maker error rate and processing efficiency, resulting in an overall decrease of 30% for the model and a 25% improvement in processing efficiency across diverse driving scenarios. Not only does this symmetrical approach reduce hallucination, but it also better aligns the virtual and real-world representations.","PeriodicalId":501198,"journal":{"name":"Symmetry","volume":"11 1","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2024-09-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Symmetry","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.3390/sym16091196","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0
Abstract
Large language models (LLMs) are widely integrated into autonomous driving systems to enhance their operational intelligence and responsiveness and improve self-driving vehicles’ overall performance. Despite these advances, LLMs still struggle between hallucinations—when models either misinterpret the environment or generate imaginary parts for downstream use cases—and taxing computational overhead that relegates their performance to strictly non-real-time operations. These are essential problems to solve to make autonomous driving as safe and efficient as possible. This work is thus focused on symmetrical trade-offs between the reduction of hallucination and optimization, leading to a framework for these two combined and at least specifically motivated by these limitations. This framework intends to generate a symmetry of mapping between real and virtual worlds. It helps in minimizing hallucinations and optimizing computational resource consumption reasonably. In autonomous driving tasks, we use multimodal LLMs that combine an image-encoding Visual Transformer (ViT) and a decoding GPT-2 with responses generated by the powerful new sequence generator from OpenAI known as GPT4. Our hallucination reduction and optimization framework leverages iterative refinement loops, RLHF—reinforcement learning from human feedback (RLHF)—along with symmetric performance metrics, e.g., BLEU, ROUGE, and CIDEr similarity scores between machine-generated answers specific to other human reference answers. This ensures that improvements in model accuracy are not overused to the detriment of increased computational overhead. Experimental results show a twofold improvement in decision-maker error rate and processing efficiency, resulting in an overall decrease of 30% for the model and a 25% improvement in processing efficiency across diverse driving scenarios. Not only does this symmetrical approach reduce hallucination, but it also better aligns the virtual and real-world representations.