An Empirical Study of Model Errors and User Error Discovery and Repair Strategies in Natural Language Database Queries

Proceedings of the 28th International Conference on Intelligent User Interfaces. Pub Date: 2023-03-27. DOI: 10.1145/3581641.3584067

Zheng Ning, Zheng Zhang, Tianyi Sun, Yuan Tian, Tianyi Zhang, Toby Jia-Jun Li


Citations: 7

Abstract

Recent advances in machine learning (ML) and natural language processing (NLP) have led to significant improvements in natural language interfaces for structured databases (NL2SQL). Despite these strides, the overall accuracy of NL2SQL models is still far from perfect (~75% on the Spider benchmark). In practice, users of NL2SQL models must therefore discern incorrect SQL queries generated by a model and fix them manually. Currently, there is no comprehensive understanding of the common errors in auto-generated SQL queries or of effective strategies for recognizing and fixing such errors. To bridge this gap, we (1) performed an in-depth analysis of errors made by three state-of-the-art NL2SQL models; (2) distilled a taxonomy of NL2SQL model errors; and (3) conducted a within-subjects user study with 26 participants to investigate the effectiveness of three representative interactive mechanisms for error discovery and repair in NL2SQL. The findings shed light on the design of future error discovery and repair strategies for natural language data query interfaces.


