Finite-State Methods and Natural Language Processing: Latest Articles, Page 2

Grammatical Framework: an Interlingual Grammar Formalism

Finite-State Methods and Natural Language Processing Pub Date: 2019-09-01 DOI: 10.18653/v1/W19-3101

Aarne Ranta

Abstract: Grammatical Framework (GF) was born at Xerox Research Centre Europe in 1998. Its purpose was to provide a declarative grammar formalism for interlingual translation systems. The core of GF is Constructive Type Theory (CTT), also known as Logical Framework, which is used for building interlingual representations. On top of these representations, GF provides a functional programming language for defining reversible mappings from interlinguas to concrete languages, equivalent to Parallel Multiple Context-Free Grammars (PMCFG). Open-source since 1999, GF has a world-wide community that has built comprehensive grammars for over 40 languages. GF is also used in several companies to build applications for translation, natural language generation, semantic analysis, chatbots, and dialogue systems. The focus has been on Controlled Natural Languages (CNL), but recent research has also combined GF with statistical and machine learning techniques, such as neural dependency parsing. In this way, GF can scale up to robust and wide-coverage language processing, without sacrificing explainability. The tutorial is meant for an audience that has some experience with formal language theory and its use in practical implementations. However, it is self-contained and does not assume specific knowledge such as CTT or PMCFG. The structure is the following:

1. Hands-on introduction (45 min). Interactive coding in the GF Cloud to get an idea of how GF works.
2. Theoretical background (45 min). GF as a formalism and programming language, with references to its main inspirations (constructive type theory, Montague grammar, categorial grammars, XFST).
3. The GF Ecosystem (30 min). Software tools, on-going academic research, commercial applications, and open-source community activities.

Citations: 0

Latent Variable Grammars for Discontinuous Parsing

Finite-State Methods and Natural Language Processing Pub Date: 2019-09-01 DOI: 10.18653/v1/W19-3103

Kilian Gebhardt

Abstract: Latent variable context-free grammars are powerful models for predicting the syntactic structure of sentences (Matsuzaki, Miyao, and Tsujii 2005; Petrov, Barrett, et al. 2006; Petrov and Klein 2007). When trained on annotated corpora, the resulting latent variables can be shown to capture different distributions for, e.g., NPs in subject and object position. Several languages (and in consequence also syntactic treebanks for these languages) such as Dutch (Lassy, van Noord 2009), German (NeGra, Skut et al. 1997; TiGer, Brants et al. 2004), but also English (Penn Treebank, Marcus, Santorini, and Marcinkiewicz 1993; Evang and Kallmeyer 2011) contain structures that cannot be adequately modelled by context-free grammars. In consequence, a class of more powerful grammar formalisms called mildly context-sensitive has been studied (cf. Kallmeyer 2010). Although parsing with these models is polynomial in the length of the input sentence (Seki et al. 1991), it has for a long time been regarded as prohibitively slow. However, in recent years it was shown that the application of mildly context-sensitive grammars is feasible in coarse-to-fine parsing approaches (van Cranenburgh 2012; Ruprecht and Denkinger 2019). In this talk I consider how both the latent variable approach and mildly context-sensitive grammars can be joined and applied to discontinuous treebanks:

1. A large class of latent variable grammars can be captured as a probabilistic regular tree grammar combined with an algebra. I show how the training methodology of latent variable PCFG can be generalized for this class.
2. I recall two mildly context-sensitive grammar formalisms: linear context-free rewriting systems (LCFRS, Vijay-Shanker, Weir, and Joshi 1987) and hybrid grammars (Nederhof and Vogler 2014; Gebhardt, Nederhof, and Vogler 2017). In particular, I consider the induction of hybrid grammars, which can be parametrized such that the polynomial complexity of parsing is of bounded degree. In this way, hybrid grammars that are structurally equivalent to finite-state automata can also be obtained.
3. I analyse different trends when training latent variable LCFRS and hybrid grammars on different discontinuous treebanks and applying them for parsing.

Citations: 0

MSO with tests and reducts

Finite-State Methods and Natural Language Processing Pub Date: 2019-09-01 DOI: 10.18653/v1/W19-3106

T. Fernando, David Woods, Carl Vogel

Citations: 1

A FST Description of Noun and Verb Morphology of Azarbaijani Turkish

Finite-State Methods and Natural Language Processing Pub Date: 2017-09-01 DOI: 10.18653/v1/W17-4008

R. Ehsani, Berke Özenç, E. Solak

Citations: 2

Finite-State Morphological Analysis for Marathi

Finite-State Methods and Natural Language Processing Pub Date: 2017-09-01 DOI: 10.18653/v1/W17-4006

Vinit Ravishankar, Francis M. Tyers

Citations: 5

Evaluating an Automata Approach to Query Containment

Finite-State Methods and Natural Language Processing Pub Date: 2017-09-01 DOI: 10.18653/v1/W17-4010

Michael Jason Minock

Citations: 0

Multi-tape Computing with Synchronous Relations

Finite-State Methods and Natural Language Processing Pub Date: 2017-09-01 DOI: 10.18653/v1/W17-4005

C. Wurm, Simon Petitjean

Citations: 0

Failure Transducers and Applications in Knowledge-Based Text Processing

Finite-State Methods and Natural Language Processing Pub Date: 2017-09-01 DOI: 10.18653/v1/W17-4001

S. Mihov, K. Schulz

Citations: 0

Word Transduction for Addressing the OOV Problem in Machine Translation for Similar Resource-Scarce Languages

Finite-State Methods and Natural Language Processing Pub Date: 2017-09-01 DOI: 10.18653/v1/W17-4009

Anssi Yli-Jyrä

Citations: 7

Harmonic Serialism and Finite-State Optimality Theory

Finite-State Methods and Natural Language Processing Pub Date: 2017-09-01 DOI: 10.18653/v1/W17-4003

Sophie Hao

Citations: 2