{"title":"A Simple Formalism for Capturing Reduplication in Finite-State Morphology","authors":"Mans Hulden, Shannon T. Bischoff","doi":"10.3233/978-1-58603-975-2-207","DOIUrl":"https://doi.org/10.3233/978-1-58603-975-2-207","url":null,"abstract":"This paper presents a simple formalism for capturing reduplication phenomena in the morphology and phonology of natural languages. After a brief survey of the facts common in reduplicative elements cross-linguistically, these facts are described in terms of finite-state systems. The principal idea is that an operator can be derived to ensure equivalence of finite discontinuous strings at some level of representation.","PeriodicalId":286427,"journal":{"name":"Finite-State Methods and Natural Language Processing","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-07-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129388882","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Learning with Weighted Transducers","authors":"Corinna Cortes, M. Mohri","doi":"10.3233/978-1-58603-975-2-14","DOIUrl":"https://doi.org/10.3233/978-1-58603-975-2-14","url":null,"abstract":"Weighted finite-state transducers have been used successfully in a variety of natural language processing applications, including speech recognition, speech synthesis, and machine translation. This paper shows how weighted transducers can be combined with existing learning algorithms to form powerful techniques for sequence learning problems.","PeriodicalId":286427,"journal":{"name":"Finite-State Methods and Natural Language Processing","volume":"53 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-07-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125927043","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Finite-State Machines for Mining Patterns in Very Large Text Repositories","authors":"Wojciech Skut","doi":"10.3233/978-1-58603-975-2-23","DOIUrl":"https://doi.org/10.3233/978-1-58603-975-2-23","url":null,"abstract":"The emergence of WWW search engines since the 1990s has changed the scale of many natural language processing applications. Text mining, information extraction and related tasks can now be applied to tens of billions of documents, which sets new efficiency standards for NLP algorithms. Finite-state machines are an obvious choice of a formal framework for such applications. However, the scale of the problem (size of the searchable corpus, number of patterns to be matched) often poses a problem even to well-established finite-state string matching techniques. In my presentation, I will focus on the experience gained in the implementation of a finite-state matching library optimized for searching large amounts of complex patterns in a WWW-scale repository of documents. Both algorithmic and implementation-related aspects of the task will be discussed. The library is based on OpenFST.","PeriodicalId":286427,"journal":{"name":"Finite-State Methods and Natural Language Processing","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-07-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130143590","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Compression Method for Natural Language Automata","authors":"L. Tounsi, Béatrice Bouchou-Markhoff, D. Maurel","doi":"10.3233/978-1-58603-975-2-146","DOIUrl":"https://doi.org/10.3233/978-1-58603-975-2-146","url":null,"abstract":"This paper deals with Finite State Automata used in Natural Language Processing to represent very large dictionaries. We present a method for an important operation applied to these automata, the compression with quick access. Our proposal is to factorize subautomata other than those representing common prefixes or suffixes. Our algorithm uses a DAWG of subautomata to iteratively choose the best substructure to factorize. The linear time accepting complexity is kept in the resulting compact automaton. Experiments performed on ten automata are reported.","PeriodicalId":286427,"journal":{"name":"Finite-State Methods and Natural Language Processing","volume":"61 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-07-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122354677","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"An XML Format Proposal for the Description of Weighted Automata, Transducers and Regular Expressions","authors":"A. Demaille, A. Duret-Lutz, Florian Lesaint, S. Lombardy, J. Sakarovitch, Florent Terrones","doi":"10.3233/978-1-58603-975-2-199","DOIUrl":"https://doi.org/10.3233/978-1-58603-975-2-199","url":null,"abstract":"We present an XML format that allows the description of a large class of finite weighted automata and transducers. Our design choices stem from our policy of making the implementation as simple as possible. This format has been tested for the communication between the modules of our automata manipulation platform Vaucanson, but this document is less an experiment report than a position paper intended to open the discussion among the community of automata software writers.","PeriodicalId":286427,"journal":{"name":"Finite-State Methods and Natural Language Processing","volume":"311 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-07-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115944213","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Optimality Theory and Vector Semirings","authors":"W. Seeker, Daniel Quernheim","doi":"10.3233/978-1-58603-975-2-134","DOIUrl":"https://doi.org/10.3233/978-1-58603-975-2-134","url":null,"abstract":"As [1] and [2] have shown, some applications of Optimality Theory can be modelled using finite state algebra provided that the constraints are regular. However, their approaches suffered from an upper bound on the number of constraint violations. We present a method to construct finite state transducers which can handle an arbitrary number of constraint violations using a variant of the tropical semiring as its weighting structure. In general, any Optimality Theory system whose constraints can be represented by regular relations can be modelled this way. Unlike [3], who used roughly the same idea, we can show that this can be achieved by using only the standard (weighted) automaton algebra.","PeriodicalId":286427,"journal":{"name":"Finite-State Methods and Natural Language Processing","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-07-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130506869","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Finite-State Local Grammars for Disambiguating Conjunctions in Portuguese Proper Names","authors":"S. Eleutério, Elisabete Ranchhod","doi":"10.3233/978-1-58603-975-2-62","DOIUrl":"https://doi.org/10.3233/978-1-58603-975-2-62","url":null,"abstract":"Like common noun phrases, proper names contain ambiguous conjoined phrases that make their delimitation and classification difficult in text. This paper presents a finite-state approach to the disambiguation of Portuguese candidate proper name strings containing the coordinating conjunction e (and). In such name strings, the conjunction can denote a relation between two independent names, but it can also be part of a multiword proper name. The coordination of multiword independent names may involve ellipsis of some lexical constituents, which causes additional difficulties to proper name identification and classification.","PeriodicalId":286427,"journal":{"name":"Finite-State Methods and Natural Language Processing","volume":"52 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-07-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121862014","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"German Compound Analysis with wfsc","authors":"Anne Schiller","doi":"10.1007/11780885_23","DOIUrl":"https://doi.org/10.1007/11780885_23","url":null,"abstract":"","PeriodicalId":286427,"journal":{"name":"Finite-State Methods and Natural Language Processing","volume":"15 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2005-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115123956","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Unsupervised Morphology Induction Using Morfessor","authors":"Mathias Creutz, K. Lagus, Sami Virpioja","doi":"10.1007/11780885_34","DOIUrl":"https://doi.org/10.1007/11780885_34","url":null,"abstract":"","PeriodicalId":286427,"journal":{"name":"Finite-State Methods and Natural Language Processing","volume":"30 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2005-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126929687","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Morphological Parsing of Tone: An Experiment with Two-Level Morphology on the Ha Language","authors":"L. Harjula","doi":"10.1007/11780885_30","DOIUrl":"https://doi.org/10.1007/11780885_30","url":null,"abstract":"","PeriodicalId":286427,"journal":{"name":"Finite-State Methods and Natural Language Processing","volume":"4 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2005-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132859073","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}