Technical Perspective of TURL

ACM SIGMOD Record, Pub Date: 2022-05-31, DOI: 10.1145/3542700.3542708

Paolo Papotti

Citations: 0

Abstract

Several efforts aim at representing tabular data with neural models to support target applications at the intersection of natural language processing (NLP) and databases (DB) [1-3]. The goal is to extend to structured data the recent neural architectures that achieve state-of-the-art results in NLP applications. Language models (LMs) are usually pre-trained with unsupervised tasks on a large text corpus. The resulting LM is then fine-tuned on a variety of downstream tasks, each with a small set of task-specific examples. This process has many advantages, because the LM contains information about textual structure and content, which the target application can use without manually defined features.


