Special Interest Group on Computational Morphology and Phonology Workshop: Latest Publications

Dynamic Correspondences: An Object-Oriented Approach to Tracking Sound Reconstructions
Special Interest Group on Computational Morphology and Phonology Workshop. Pub Date: 2007-06-28. DOI: 10.3115/1626516.1626532
Tyler Peterson, Gessiane Picanco
Abstract: This paper reports the results of a research project that experiments with cross-tabulation as an aid to phonemic reconstruction. Data from the Tupi stock were used, and three tests were conducted to determine the efficacy of this application: confirming and challenging a previously established reconstruction in the family; testing a new reconstruction generated by our model; and testing the upper limit of simultaneous, multiple correspondences across several languages. Our conclusion is that cross-tabulations (implemented within a database as pivot tables) offer an innovative and effective tool for comparative study and sound reconstruction.
Citations: 6
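The cross-tabulation idea in the abstract above can be illustrated with a minimal sketch. It is not the authors' database implementation: the aligned segment pairs are invented toy symbols, and the function names `crosstab` and `pivot` are hypothetical. The sketch only shows how counting co-occurring segments across two languages yields a pivot-style correspondence table.

```python
from collections import Counter

# Hypothetical, hand-aligned data: each pair holds the segment found in the
# same position of a cognate in language A and in language B (toy symbols only).
aligned_segments = [
    ("p", "p"),
    ("p", "p"),
    ("p", "b"),
    ("t", "t"),
    ("t", "s"),
    ("t", "s"),
    ("k", "k"),
]


def crosstab(pairs):
    """Count how often each language A segment co-occurs with each language B segment."""
    return Counter(pairs)


def pivot(counts):
    """Arrange the counts as a nested dict: row = language A segment, column = language B segment."""
    table = {}
    for (a, b), n in counts.items():
        table.setdefault(a, {})[b] = n
    return table


table = pivot(crosstab(aligned_segments))
for a in sorted(table):
    print(a, dict(sorted(table[a].items())))
```

Rows with more than one populated column (here `p` and `t`) are the ones a comparativist would inspect further, for example to look for a conditioning environment.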
A Combined Phonetic-Phonological Approach to Estimating Cross-Language Phoneme Similarity in an ASR Environment
Special Interest Group on Computational Morphology and Phonology Workshop. Pub Date: 2006-06-08. DOI: 10.3115/1622165.1622166
L. Melnar, Chen Liu
Abstract: This paper presents a fully automated linguistic approach to measuring distance between phonemes across languages. In this approach, a phoneme is represented by a feature matrix where feature categories are fixed, hierarchically related and binary-valued; feature categorization explicitly addresses allophonic variation, and feature values are weighted based on their relative prominence derived from lexical frequency measurements. The relative weight of feature values is factored into the phonetic distance calculation. Two phonological distances are statistically derived from lexical frequency measurements. The phonetic distance is combined with the phonological distances to produce a single metric that quantifies cross-language phoneme distance. The performances of target-language phoneme HMMs constructed solely with source-language HMMs, first selected by the combined phonetic and phonological metric and then by a data-driven, acoustic distance-based method, are compared in context-independent automatic speech recognition (ASR) experiments. Results show that this approach consistently performs equivalently to the acoustics-based approach, confirming its effectiveness in estimating cross-language similarity between phonemes in an ASR environment.
Citations: 4
Richness of the Base and Probabilistic Unsupervised Learning in Optimality Theory
Special Interest Group on Computational Morphology and Phonology Workshop. Pub Date: 2006-06-08. DOI: 10.3115/1622165.1622172
G. Jarosz
Abstract: This paper proposes an unsupervised learning algorithm for Optimality Theoretic grammars, which learns a complete constraint ranking and a lexicon given only unstructured surface forms and morphological relations. The learning algorithm, which is based on the Expectation-Maximization algorithm, gradually maximizes the likelihood of the observed forms by adjusting the parameters of a probabilistic constraint grammar and a probabilistic lexicon. The paper presents the algorithm's results on three constructed language systems with different types of hidden structure: voicing neutralization, stress, and abstract vowels. In all cases the algorithm learns the correct constraint ranking and lexicon. The paper argues that the algorithm's ability to identify correct, restrictive grammars is due in part to its explicit reliance on the Optimality Theoretic notion of Richness of the Base.
Citations: 24
Learning Probabilistic Paradigms for Morphology in a Latent Class Model
Special Interest Group on Computational Morphology and Phonology Workshop. Pub Date: 2006-06-08. DOI: 10.3115/1622165.1622174
Erwin Chan
Abstract: This paper introduces the probabilistic paradigm, a probabilistic, declarative model of morphological structure. We describe an algorithm that recursively applies Latent Dirichlet Allocation with an orthogonality constraint to discover morphological paradigms as the latent classes within a suffix-stem matrix. We apply the algorithm to data preprocessed in several different ways, and show that when suffixes are distinguished for part of speech and allomorphs or gender/conjugational variants are merged, the model is able to correctly learn morphological paradigms for English and Spanish. We compare our system with Linguistica (Goldsmith 2001), and discuss the advantages of the probabilistic paradigm over Linguistica's signature representation.
Citations: 32
Learning Quantity Insensitive Stress Systems via Local Inference
Special Interest Group on Computational Morphology and Phonology Workshop. Pub Date: 2006-06-08. DOI: 10.3115/1622165.1622168
Jeffrey Heinz
Abstract: This paper presents an unsupervised batch learner for the quantity-insensitive stress systems described in Gordon (2002). Unlike previous stress learning models, the learner presented here is neither cue-based (Dresher and Kaye, 1990) nor reliant on a priori Optimality-theoretic constraints (Tesar, 1998). Instead, our learner exploits a property called neighborhood-distinctness, which is shared by all of the target patterns. Some consequences of this approach include a natural explanation for the occurrence of binary and ternary rhythmic patterns, the lack of higher n-ary rhythms, and the fact that, in these systems, stress always falls within a certain window of word edges.
Citations: 12
Improved morpho-phonological sequence processing with constraint satisfaction inference
Special Interest Group on Computational Morphology and Phonology Workshop. Pub Date: 2006-06-08. DOI: 10.3115/1622165.1622171
Antal van den Bosch, S. Canisius
Abstract: In performing morpho-phonological sequence processing tasks, such as letter-phoneme conversion or morphological analysis, it is typically not enough to base the output sequence on local decisions that map local-context input windows to single output tokens. We present a global sequence-processing method that repairs inconsistent local decisions. The approach is based on local predictions of overlapping trigrams of output tokens, which open up a space of possible sequences; a data-driven constraint satisfaction inference step then searches for the optimal output sequence. We demonstrate significant improvements in terms of word accuracy on English and Dutch letter-phoneme conversion and morphological segmentation, and we provide qualitative analyses of error types prevented by the constraint satisfaction inference method.
Citations: 19
A Naive Theory of Affixation and an Algorithm for Extraction
Special Interest Group on Computational Morphology and Phonology Workshop. Pub Date: 2006-06-08. DOI: 10.3115/1622165.1622175
H. Hammarström
Abstract: We present a novel approach to the unsupervised detection of affixes, that is, the extraction of a set of salient prefixes and suffixes from an unlabeled corpus of a language. The underlying theory makes no assumptions about whether the language uses a lot of morphology or not, whether it is prefixing or suffixing, or whether affixes are long or short. It does, however, make two assumptions: (1) salient affixes have to be frequent, i.e. occur much more often than random segments of the same length, and (2) words essentially are variable-length sequences of random characters, e.g. a character should not occur in far more words than chance would predict without a reason, such as being part of a very frequent affix. The affix extraction algorithm uses only information from fluctuation of frequencies, runs in linear time, and is free from thresholds and non-transparent iterations. We demonstrate the usefulness of the approach with example case studies on typologically distant languages.
Citations: 16
Exploring variant definitions of pointer length in MDL
Special Interest Group on Computational Morphology and Phonology Workshop. Pub Date: 2006-06-08. DOI: 10.3115/1622165.1622170
Aris Xanthos, Yu Hu, J. Goldsmith
Abstract: Within the information-theoretical framework described by Rissanen (1989), de Marcken (1996), and Goldsmith (2001), pointers are used to avoid repetition of phonological material. Work with which we are familiar has assumed that there is only one way in which items could be pointed to. The purpose of this paper is to describe and compare several different methods, each of which satisfies MDL's basic requirements, but which have different consequences for the treatment of linguistic phenomena. In particular, we assess the conditions under which these different ways of pointing yield more compact descriptions of the data, from both a theoretical and an empirical perspective.
Citations: 5
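For orientation, the sketch below computes pointer lengths under one common definition in this framework: the cost in bits of pointing to an item is the negative base-2 logarithm of its relative frequency. The variant definitions compared in the paper are not reproduced here; the stem tokens and the function name `pointer_lengths` are assumptions made for illustration.

```python
import math
from collections import Counter


def pointer_lengths(tokens):
    """Return the length in bits of a pointer to each distinct item: -log2(count / total)."""
    counts = Counter(tokens)
    total = sum(counts.values())
    return {item: -math.log2(n / total) for item, n in counts.items()}


# Hypothetical stem tokens, invented for illustration.
stems = ["walk", "walk", "walk", "walk", "jump", "jump", "talk", "sing"]
lengths = pointer_lengths(stems)

# Frequent items get short pointers, rare items get long ones.
for stem, bits in sorted(lengths.items(), key=lambda kv: kv[1]):
    print(f"{stem}: {bits:.1f} bits")
```

Under this definition an item seen in half of the tokens costs one bit to point to, and each halving of frequency adds one bit.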
Morphology Induction from Limited Noisy Data Using Approximate String Matching
Special Interest Group on Computational Morphology and Phonology Workshop. Pub Date: 2006-06-08. DOI: 10.3115/1622165.1622173
Burcu Karagol-Ayan, D. Doermann, A. Weinberg
Abstract: For a language with limited resources, a dictionary may be one of the few available electronic resources. To make effective use of the dictionary for translation, however, users must be able to access it using the root form of the morphologically deformed variant found in the text. Stemming and data-driven methods, however, are not suitable when data is sparse. We present algorithms for discovering morphemes from limited, noisy data obtained by scanning a hard-copy dictionary. Our approach is based on the novel application of the longest common substring and string edit distance metrics. Results show that these algorithms can in fact segment words into roots and affixes from the limited data contained in a dictionary, and extract affixes. This in turn allows non-native speakers to perform multilingual tasks for applications where response must be rapid and their knowledge is limited. In addition, this analysis can feed other NLP tools requiring lexicons.
Citations: 5
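The two string metrics named in the abstract above are standard and can be sketched with textbook dynamic programming. This is not the authors' induction algorithm: the function names are hypothetical, and the English word pair is a toy example chosen only to show how a shared substring suggests a root while the leftover material suggests affixes.

```python
def longest_common_substring(a, b):
    """Return the longest contiguous string shared by a and b (first one found on ties)."""
    best_len, best_end = 0, 0
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                # Extend the common run that ended at a[i-2], b[j-2].
                curr[j] = prev[j - 1] + 1
                if curr[j] > best_len:
                    best_len, best_end = curr[j], i
        prev = curr
    return a[best_end - best_len:best_end]


def edit_distance(a, b):
    """Levenshtein distance: minimum number of insertions, deletions, and substitutions."""
    prev = list(range(len(b) + 1))
    for i in range(1, len(a) + 1):
        curr = [i] + [0] * len(b)
        for j in range(1, len(b) + 1):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            curr[j] = min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost)
        prev = curr
    return prev[len(b)]


# Toy example: the shared substring is a root candidate, the remainders are affix candidates.
root = longest_common_substring("walking", "walked")
suffixes = ["walking"[len(root):], "walked"[len(root):]]
print(root, suffixes, edit_distance("walking", "walked"))
```

Edit distance is the more forgiving of the two on noisy scanned text, since a single misrecognized character costs one operation rather than breaking a common substring in two.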
Unsupervised Learning of Morphology for Building Lexicon for a Highly Inflectional Language
Special Interest Group on Computational Morphology and Phonology Workshop. Pub Date: 2002-07-11. DOI: 10.3115/1118647.1118648
U. Sharma, J. Kalita, R. Das
Abstract: Words play a crucial role in aspects of natural language understanding such as syntactic and semantic processing. Usually, a natural language understanding system either already knows the words that appear in the text, or is able to automatically learn relevant information about a word upon encountering it. Usually, a capable system, human or machine, knows a subset of the entire vocabulary of a language and morphological rules to determine attributes of words not seen before. Developing a knowledge base of legal words and morphological rules is an important task in computational linguistics. In this paper, we describe initial experiments following an approach based on unsupervised learning of morphology from a text corpus specially developed for this purpose. It is a method for conveniently creating a dictionary and a morphology rule base, and is especially suitable for highly inflectional languages like Assamese. Assamese is a major Indian language of the Indic branch of the Indo-European family of languages. It is used by around 15 million people.
Citations: 28