Finite-State Methods and Natural Language Processing: Latest Literature

Transducer Minimization and Information Compression for NooJ Dictionaries
Finite-State Methods and Natural Language Processing Pub Date : 2009-07-11 DOI: 10.3233/978-1-58603-975-2-110
Slim Mesfar, M. Silberztein
Abstract: In this paper, we describe the use of an incremental construction method of minimal, acyclic, deterministic FST. The approach consists in constructing a transducer in a single step by adding new strings one by one and minimizing the resultant automaton incrementally. Then, we present a new method to encode the morphological information associated with the dictionary entries. The new encoding unifies a large number of word forms' analyses, thus reducing the number of terminal states of the dictionary's FST, that triggers a more efficient minimization process. Finally, we present experimental results on the FST that represents the Arabic dictionary.
Citations: 5
Finite State Models for the Generation of Large Corpora of Natural Language Texts
Finite-State Methods and Natural Language Processing Pub Date : 2009-07-11 DOI: 10.3233/978-1-58603-975-2-175
Domenico Cantone, S. Cristofaro, S. Faro, Emanuele Giaquinta
Abstract: Natural languages are probably one of the most common type of input for text processing algorithms. Therefore, it is often desirable to have a large training/testing set of input of this kind, especially when dealing with algorithms tuned for natural language texts. In many cases the problem due to the lack of big corpus of natural language texts can be solved by simply concatenating a set of collected texts, even with heterogeneous contexts and by different authors. In this note we present a preliminary study on a finite state model for text generation which maintains statistical and structural characteristics of natural language texts, i.e., Zipf's law and inverse-rank power law, thus providing a very good approximation for testing purposes.
Citations: 3
Regular Expressions and Predicate Logic in Finite-State Language Processing
Finite-State Methods and Natural Language Processing Pub Date : 2009-07-11 DOI: 10.3233/978-1-58603-975-2-82
Mans Hulden
Abstract: This paper proposes an extension to the formalism of regular expressions with a form of predicate logic where quantified propositions apply to substrings. The implementation hinges crucially on the manipulation of auxiliary symbols which has been a common, though previously unsystematized practice in finite-state language processing. We also apply the notation to give alternate compilation methods for two-level grammars and various types of replacement rules found in the literature, and show that, under a certain interpretation, two-level rules and many types of replacement rules are equivalent.
Citations: 15
Representing and Combining Calendar Information by Using Finite-State Transducers
Finite-State Methods and Natural Language Processing Pub Date : 2009-07-11 DOI: 10.3233/978-1-58603-975-2-122
J. Niemi, K. Koskenniemi
Abstract: This paper elaborates a model for representing various types of semantic calendar expressions (SCEs), which correspond to the disambiguated intensional meanings of natural-language calendar phrases. The model uses finite-state transducers (FSTs) to mark denoted periods of time on a set of timelines also represented as an FST. In addition to an overview of the model, the paper presents methods to combine the periods marked on two timeline FSTs into a single timeline FST and to adjust the granularity and span of time of a timeline FST. The paper also discusses advantages and limitations of the model.
Citations: 5
Making Finite-State Methods Applicable to Languages Beyond Context-Freeness via Multi-dimensional Trees
Finite-State Methods and Natural Language Processing Pub Date : 2009-07-11 DOI: 10.3233/978-1-58603-975-2-98
Anna Kasprzik
Abstract: We provide a new term-like representation for multi-dimensional trees as defined by Rogers [1,2] which establishes them as a direct generalization of classical trees. As a consequence these structures can be used as input for finite-state applications based on classical term-based tree language theory. Via the correspondence between string and tree languages these applications can then be conceived to be able to process even some language classes beyond context-freeness.
Citations: 9
Forest FIRE and FIRE Wood: Tools for Tree Automata and Tree Algorithms
Finite-State Methods and Natural Language Processing Pub Date : 2009-07-11 DOI: 10.3233/978-1-58603-975-2-191
L. Cleophas
Abstract: Pattern matching, acceptance, and parsing algorithms on node-labeled, ordered, ranked trees ('tree algorithms') are important for applications such as instruction selection and tree transformation/term rewriting. Many such algorithms have been developed. They often are based on results from such algorithms on words or generalizations thereof using finite (tree) automata. Regrettably no coherent, extensive toolkit of such algorithms and automata existed, complicating their use. Our toolkit FOREST FIRE contains many such algorithms and automata constructions. It is accompanied by the graphical user interface (GUI) FIRE WOOD. The toolkit and GUI provide a useful environment for experimenting with and comparing the algorithms. In this tool paper we give an overview of the toolkit and GUI, their context and design rationale, and mention some results obtained with them.
Citations: 14
CLARIN and Free Open Source Finite-State Tools
Finite-State Methods and Natural Language Processing Pub Date : 2009-07-11 DOI: 10.3233/978-1-58603-975-2-3
K. Koskenniemi, Anssi Yli-Jyrä
Abstract: A new emerging European research infrastructure called CLARIN and a related project called HFST are briefly described. HFST has built a programming interface on top of some existing open source finite-state packages such as SFST and OpenFST. In order to verify its utility, HFST has built open source tools on top of this HFST interface. These tools create lexical transducers, compile morphophonological two-level rules and combine them into a transducer lexicon. The tools have been tested against independently created with full-scale lexicons and rules for Northern Sami and Lule Sami languages which have more complicated lexical and morphophonological structure than most other European languages.
Citations: 6
Large-Scale Statistical Machine Translation with Weighted Finite State Transducers
Finite-State Methods and Natural Language Processing Pub Date : 2009-07-11 DOI: 10.3233/978-1-58603-975-2-39
Graeme W. Blackwood, A. Gispert, J. Brunning, W. Byrne
Abstract: The Cambridge University Engineering Department phrase-based statistical machine translation system follows a generative model of translation and is implemented by the composition of component models of translation and movement realised as Weighted Finite State Transducers. Our flexible architecture requires no special purpose decoder and readily handles the large-scale natural language processing demands of state-of-the-art machine translation systems. In this paper we describe the CUED system's participation in the NIST 2008 Arabic-English machine translation evaluation task.
Citations: 10
Event Extraction for Italian Using a Cascade of Finite-State Grammars
Finite-State Methods and Natural Language Processing Pub Date : 2009-07-11 DOI: 10.3233/978-1-58603-975-2-158
Vanni Zavarella, Hristo Tanev, J. Piskorski
Abstract: This paper reports on our experience of adapting a real-world live event extraction system based on a cascade of finite-state extraction grammars to the processing of a new language, namely Italian. The real-time event extraction processing chain and the pattern specification language are briefly presented. The major part of the paper focuses on the creation of event extraction grammars and related resources for English and their adaptation for extracting events in Italian news articles. Some interesting phenomena which complicate the event extraction task for Italian are pinpointed and the results of the evaluation are presented. In particular, we compared two versions of the system for Italian, one based on surface-level patterns and a hybrid one, which integrates slightly more linguistically sophisticated patterns for covering a rich variety of morphological and syntactic constructions in Italian.
Citations: 11
A Simple Formalism for Capturing Reduplication in Finite-State Morphology
Finite-State Methods and Natural Language Processing Pub Date : 2009-07-11 DOI: 10.3233/978-1-58603-975-2-207
Mans Hulden, Shannon T. Bischoff
Abstract: This paper presents a simple formalism for capturing reduplication phenomena in the morphology and phonology of natural languages. After a brief survey of the facts common in reduplicative elements cross-linguistically, these facts are described in terms of finite-state systems. The principal idea is that an operator can be derived to ensure equivalence of finite discontinuous strings at some level of representation.
Citations: 8