A Graph-Based Framework for Structured Prediction Tasks in Sanskrit

IF 5.3 · Zone 2 (Computer Science) · JCR Q2 (COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE)

Computational Linguistics Pub Date : 2020-10-22 DOI:10.1162/coli_a_00390

A. Krishna, Ashim Gupta, Pawan Goyal, Bishal Santra, Pavankumar Satuluri

Volume "46 1", pages 785-845. Publication type: Journal Article. Source platform: Semanticscholar.

Citations: 19

Abstract

We propose a framework using energy-based models for multiple structured prediction tasks in Sanskrit. Ours is an arc-factored model, similar to the graph-based parsing approaches, and we consider the tasks of word segmentation, morphological parsing, dependency parsing, syntactic linearization, and prosodification, a “prosody-level” task we introduce in this work. Ours is a search-based structured prediction framework, which expects a graph as input, where relevant linguistic information is encoded in the nodes, and the edges are then used to indicate the association between these nodes. Typically, the state-of-the-art models for morphosyntactic tasks in morphologically rich languages still rely on hand-crafted features for their performance. But here, we automate the learning of the feature function. The feature function so learned, along with the search space we construct, encodes relevant linguistic information for the tasks we consider. This enables us to substantially reduce the training data requirements to as low as 10%, as compared to the data requirements for the neural state-of-the-art models. Our experiments in Czech and Sanskrit show the language-agnostic nature of the framework, where we train highly competitive models for both languages. Moreover, our framework enables us to incorporate language-specific constraints to prune the search space and to filter the candidates during inference. We obtain significant improvements in morphosyntactic tasks for Sanskrit by incorporating language-specific constraints into the model. In all the tasks we discuss for Sanskrit, we either achieve state-of-the-art results or ours is the only data-driven solution for those tasks.
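To make the phrase "arc-factored" concrete, here is a minimal Python sketch, not the authors' implementation. It assumes a toy three-word input graph with an artificial root node 0, made-up edge energies (`EDGE_ENERGY`), and an exhaustive search in place of whatever inference procedure the paper actually uses. The only idea it illustrates is that the energy of a candidate structure is the sum of the energies of its edges, and that prediction means searching the input graph for the lowest-energy valid structure.

```python
from itertools import product

# Hypothetical edge energies for a 3-word input; node 0 is an artificial root.
# EDGE_ENERGY[(head, dependent)]: lower energy means a more plausible arc.
EDGE_ENERGY = {
    (0, 1): 2.0, (0, 2): 0.5, (0, 3): 3.0,
    (1, 2): 1.5, (1, 3): 2.5,
    (2, 1): 0.4, (2, 3): 0.6,
    (3, 1): 2.2, (3, 2): 1.9,
}


def structure_energy(heads):
    """Arc-factored energy: the sum of the energies of the individual arcs."""
    return sum(EDGE_ENERGY[(h, d)] for d, h in heads.items())


def is_tree(heads):
    """True if every word reaches the root (node 0) without entering a cycle."""
    for node in heads:
        seen = set()
        while node != 0:
            if node in seen:
                return False
            seen.add(node)
            node = heads[node]
    return True


def min_energy_tree(n_words):
    """Brute-force search over head assignments (an assumption made for
    illustration only); returns (heads, energy) for the lowest-energy tree."""
    words = range(1, n_words + 1)
    best = None
    for choice in product(range(n_words + 1), repeat=n_words):
        heads = dict(zip(words, choice))
        if not is_tree(heads):  # also rejects self-loops
            continue
        energy = structure_energy(heads)
        if best is None or energy < best[1]:
            best = (heads, energy)
    return best


if __name__ == "__main__":
    heads, energy = min_energy_tree(3)
    print(heads, energy)
```

In this toy setting a language-specific constraint of the kind the abstract mentions would amount to an extra filter next to `is_tree`, discarding candidates before their energy is compared. Exhaustive enumeration is only feasible for tiny graphs and is used here purely to keep the sketch short.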


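Source journal

Computational Linguistics (Engineering and Technology: Computer Science, Interdisciplinary Applications)

CiteScore

15.80

Self-citation rate

0.00%

Articles published

Review time

>12 weeks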

About the journal: Computational Linguistics, the longest-running publication dedicated solely to the computational and mathematical aspects of language and the design of natural language processing systems, provides university and industry linguists, computational linguists, AI and machine learning researchers, cognitive scientists, speech specialists, and philosophers with the latest insights into the computational aspects of language research.