{"title":"基于逆Kullback-Leibler和动态α调度器的大型语言模型提取与结构化查询语言生成","authors":"Nhat Le , Quan Ninh , Tung Le , Huy Tien Nguyen","doi":"10.1016/j.compeleceng.2025.110607","DOIUrl":null,"url":null,"abstract":"<div><div>With the increasing reliance on data-driven decision-making, the demand for efficient Structured Query Language (SQL) query generation has grown significantly, as it serves as a crucial bridge between natural language and databases. While Large Language Models (LLMs) excel in this task, their high computational costs and inconsistent effectiveness pose significant limitations. This study introduces a knowledge distillation approach to create efficient, high-performing models for SQL generation. By integrating teacher and student model distributions with a dynamic <span><math><mi>α</mi></math></span> scheduler inspired by learning rate schedulers, the method adjusts the teacher’s influence during training, enhancing stability and narrowing performance gaps. Additionally, reverse Kullback–Leibler Divergence (KLD) loss balances contributions, allowing the student model to refine itself while leveraging teacher guidance. The resulting distilled student model, which is 100 times smaller than GPT-4, achieves 80.5% accuracy on benchmark datasets and outperforms GPT-4 in this domain. Furthermore, it demonstrates a 10.2% improvement on extra-hard questions compared to its undistilled counterpart. This work highlights the potential of optimizing LLMs for reduced computational costs and superior performance in SQL query generation and beyond.</div></div>","PeriodicalId":50630,"journal":{"name":"Computers & Electrical Engineering","volume":"127 ","pages":"Article 110607"},"PeriodicalIF":4.9000,"publicationDate":"2025-08-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Distilling Large Language Models for Structured Query Language generation with reverse Kullback–Leibler and dynamic α scheduler\",\"authors\":\"Nhat Le , Quan Ninh , Tung Le , Huy Tien Nguyen\",\"doi\":\"10.1016/j.compeleceng.2025.110607\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<div><div>With the increasing reliance on data-driven decision-making, the demand for efficient Structured Query Language (SQL) query generation has grown significantly, as it serves as a crucial bridge between natural language and databases. While Large Language Models (LLMs) excel in this task, their high computational costs and inconsistent effectiveness pose significant limitations. This study introduces a knowledge distillation approach to create efficient, high-performing models for SQL generation. By integrating teacher and student model distributions with a dynamic <span><math><mi>α</mi></math></span> scheduler inspired by learning rate schedulers, the method adjusts the teacher’s influence during training, enhancing stability and narrowing performance gaps. Additionally, reverse Kullback–Leibler Divergence (KLD) loss balances contributions, allowing the student model to refine itself while leveraging teacher guidance. The resulting distilled student model, which is 100 times smaller than GPT-4, achieves 80.5% accuracy on benchmark datasets and outperforms GPT-4 in this domain. Furthermore, it demonstrates a 10.2% improvement on extra-hard questions compared to its undistilled counterpart. 
This work highlights the potential of optimizing LLMs for reduced computational costs and superior performance in SQL query generation and beyond.</div></div>\",\"PeriodicalId\":50630,\"journal\":{\"name\":\"Computers & Electrical Engineering\",\"volume\":\"127 \",\"pages\":\"Article 110607\"},\"PeriodicalIF\":4.9000,\"publicationDate\":\"2025-08-22\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Computers & Electrical Engineering\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://www.sciencedirect.com/science/article/pii/S0045790625005506\",\"RegionNum\":3,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"COMPUTER SCIENCE, HARDWARE & ARCHITECTURE\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Computers & Electrical Engineering","FirstCategoryId":"94","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S0045790625005506","RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, HARDWARE & ARCHITECTURE","Score":null,"Total":0}
Distilling Large Language Models for Structured Query Language generation with reverse Kullback–Leibler and dynamic α scheduler
Abstract: With the increasing reliance on data-driven decision-making, the demand for efficient Structured Query Language (SQL) query generation has grown significantly, as it serves as a crucial bridge between natural language and databases. While Large Language Models (LLMs) excel in this task, their high computational costs and inconsistent effectiveness pose significant limitations. This study introduces a knowledge distillation approach to create efficient, high-performing models for SQL generation. By integrating teacher and student model distributions with a dynamic α scheduler inspired by learning rate schedulers, the method adjusts the teacher's influence during training, enhancing stability and narrowing performance gaps. Additionally, a reverse Kullback–Leibler Divergence (KLD) loss balances their contributions, allowing the student model to refine itself while leveraging teacher guidance. The resulting distilled student model, which is 100 times smaller than GPT-4, achieves 80.5% accuracy on benchmark datasets and outperforms GPT-4 in this domain. Furthermore, it demonstrates a 10.2% improvement on extra-hard questions compared to its undistilled counterpart. This work highlights the potential of optimizing LLMs for reduced computational costs and superior performance in SQL query generation and beyond.
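The abstract names two technical components, a reverse KLD loss and a dynamic α scheduler, without giving their formulas. The sketch below is a minimal illustration of how such a setup is commonly realized in PyTorch, not the authors' implementation: α(t) decays along a cosine curve, mirroring a learning rate scheduler, and weights a reverse-KL term KL(student ∥ teacher) against the standard cross-entropy on gold SQL tokens. The cosine shape, the α bounds, and the temperature are assumptions made for illustration only.

```python
# Minimal sketch (assumed design, not the paper's released code) of a reverse-KL
# distillation loss weighted by a dynamic alpha scheduler.
import math

import torch
import torch.nn.functional as F


def reverse_kl(student_logits: torch.Tensor, teacher_logits: torch.Tensor,
               temperature: float = 1.0) -> torch.Tensor:
    """Reverse KLD, KL(q_student || p_teacher), averaged over token positions."""
    log_q = F.log_softmax(student_logits / temperature, dim=-1)  # student log-probs
    log_p = F.log_softmax(teacher_logits / temperature, dim=-1)  # teacher log-probs
    return (log_q.exp() * (log_q - log_p)).sum(dim=-1).mean()


def alpha_schedule(step: int, total_steps: int,
                   alpha_max: float = 0.9, alpha_min: float = 0.1) -> float:
    """Cosine decay of the teacher weight alpha, analogous to a cosine LR schedule."""
    progress = min(step / max(total_steps, 1), 1.0)
    return alpha_min + 0.5 * (alpha_max - alpha_min) * (1.0 + math.cos(math.pi * progress))


def distillation_loss(student_logits, teacher_logits, labels, step, total_steps):
    """Blend hard-label cross-entropy on gold SQL tokens with the reverse-KL term."""
    alpha = alpha_schedule(step, total_steps)
    ce = F.cross_entropy(student_logits.view(-1, student_logits.size(-1)),
                         labels.view(-1))
    kd = reverse_kl(student_logits, teacher_logits.detach())  # no gradients into the teacher
    return alpha * kd + (1.0 - alpha) * ce
```

One reason reverse KL is attractive here is that it is mode-seeking: it penalizes the student for placing probability mass where the teacher places little, which suits a small student that cannot cover every mode of a much larger teacher.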
Journal description:
The impact of computers has nowhere been more revolutionary than in electrical engineering. The design, analysis, and operation of electrical and electronic systems are now dominated by computers, a transformation that has been motivated by the natural ease of interface between computers and electrical systems, and the promise of spectacular improvements in speed and efficiency.
Published since 1973, Computers & Electrical Engineering provides rapid publication of topical research into the integration of computer technology and computational techniques with electrical and electronic systems. The journal publishes papers featuring novel implementations of computers and computational techniques in areas like signal and image processing, high-performance computing, parallel processing, and communications. Special attention will be paid to papers describing innovative architectures, algorithms, and software tools.