Proceedings of the 18th SIGMORPHON Workshop on Computational Research in Phonetics, Phonology, and Morphology: Latest Articles

Findings of the SIGMORPHON 2021 Shared Task on Unsupervised Morphological Paradigm Clustering
Adam Wiemerslage, Arya D. McCarthy, Alexander Erdmann, Garrett Nicolai, Manex Agirrezabal, Miikka Silfverberg, Mans Hulden, Katharina Kann
Abstract: We describe the second SIGMORPHON shared task on unsupervised morphology: the goal of the SIGMORPHON 2021 Shared Task on Unsupervised Morphological Paradigm Clustering is to cluster word types from a raw text corpus into paradigms. To this end, we release corpora for 5 development and 9 test languages, as well as gold partial paradigms for evaluation. We receive 14 submissions from 4 teams that follow different strategies, and the best performing system is based on adaptor grammars. Results vary significantly across languages. However, all systems are outperformed by a supervised lemmatizer, implying that there is still room for improvement.
DOI: https://doi.org/10.18653/v1/2021.sigmorphon-1.8
Citations: 5
An FST morphological analyzer for the Gitksan language
C. Forbes, Garrett Nicolai, Miikka Silfverberg
Abstract: This paper presents a finite-state morphological analyzer for the Gitksan language. The analyzer draws from a 1250-token Eastern dialect wordlist. It is based on finite-state technology and additionally includes two extensions which can provide analyses for out-of-vocabulary words: rules for generating predictable dialect variants, and a neural guesser component. The pre-neural analyzer, tested against interlinear-annotated texts from multiple dialects, achieves coverage of (75-81%), and maintains high precision (95-100%). The neural extension improves coverage at the cost of lowered precision.
DOI: https://doi.org/10.18653/v1/2021.sigmorphon-1.21
Citations: 5
Linguistic Knowledge in Multilingual Grapheme-to-Phoneme Conversion
R. Lo, Garrett Nicolai
Abstract: This paper documents the UBC Linguistics team’s approach to the SIGMORPHON 2021 Grapheme-to-Phoneme Shared Task, concentrating on the low-resource setting. Our systems expand the baseline model with simple modifications informed by syllable structure and error analysis. In-depth investigation of test-set predictions shows that our best model rectifies a significant number of mistakes compared to the baseline prediction, besting all other submissions. Our results validate the view that careful error analysis in conjunction with linguistic knowledge can lead to more effective computational modeling.
DOI: https://doi.org/10.18653/v1/2021.sigmorphon-1.15
Citations: 3
Were We There Already? Applying Minimal Generalization to the SIGMORPHON-UniMorph Shared Task on Cognitively Plausible Morphological Inflection
Colin Wilson, Jane S.Y. Li
Abstract: Morphological rules with various levels of specificity can be learned from example lexemes by recursive application of minimal generalization (Albright and Hayes, 2002, 2003). A model that learns rules solely through minimal generalization was used to predict average human wug-test ratings from German, English, and Dutch in the SIGMORPHON-UniMorph 2021 Shared Task, with competitive results. Some formal properties of the minimal generalization operation were proved, [...] pruned. An automatic method was developed to create wug-test stimuli for future experiments that investigate whether the model’s morphological generalizations are too minimal.
DOI: https://doi.org/10.18653/v1/2021.sigmorphon-1.29
Citations: 3
Unsupervised Paradigm Clustering Using Transformation Rules
Changbing Yang, Garrett Nicolai, Miikka Silfverberg
Abstract: This paper describes the submission of the CU-UBC team for the SIGMORPHON 2021 Shared Task 2: Unsupervised morphological paradigm clustering. Our system generates paradigms using morphological transformation rules which are discovered from raw data. We experiment with two methods for discovering rules. Our first approach generates prefix and suffix transformations between similar strings. Secondly, we experiment with more general rules which can apply transformations inside the input strings in addition to prefix and suffix transformations. We find that the best overall performance is delivered by prefix and suffix rules but more general transformation rules perform better for languages with templatic morphology and very high morpheme-to-word ratios.
DOI: https://doi.org/10.18653/v1/2021.sigmorphon-1.11
Citations: 2
The Match-Extend serialization algorithm in Multiprecedence
Maxime Papillon
Abstract: Raimy (1999; 2000a; 2000b) proposed a graphical formalism for modeling reduplication, originally mostly focused on phonological overapplication in a derivational framework. This framework is now known as Precedence-based phonology or Multiprecedence phonology. Raimy’s idea is that the segments at the input to the phonology are not totally ordered by precedence. This paper tackles a challenge that arose with Raimy’s work, the development of a deterministic serialization algorithm as part of the derivation of surface forms. The Match-Extend algorithm introduced here requires fewer assumptions and sticks tighter to the attested typology. The algorithm also contains no parameter or constraint specific to individual graphs or topologies, unlike previous proposals. Match-Extend requires nothing except knowing the last added set of links.
DOI: https://doi.org/10.18653/v1/2021.sigmorphon-1.3
Citations: 0
A Study of Morphological Robustness of Neural Machine Translation
Sai Muralidhar Jayanthi, Adithya Pratapa
Abstract: In this work, we analyze the robustness of neural machine translation systems towards grammatical perturbations in the source. In particular, we focus on morphological inflection related perturbations. While this has been recently studied for English→French (MORPHEUS) (Tan et al., 2020), it is unclear how this extends to Any→English translation systems. We propose MORPHEUS-MULTILINGUAL that utilizes UniMorph dictionaries to identify morphological perturbations to source that adversely affect the translation models. Along with an analysis of state-of-the-art pretrained MT systems, we train and analyze systems for 11 language pairs using the multilingual TED corpus (Qi et al., 2018). We also compare this to actual errors of non-native speakers using Grammatical Error Correction datasets. Finally, we present a qualitative and quantitative analysis of the robustness of Any→English translation systems.
DOI: https://doi.org/10.18653/v1/2021.sigmorphon-1.6
Citations: 1
Training Strategies for Neural Multilingual Morphological Inflection
Adam Ek, Jean-Philippe Bernardy
Abstract: This paper presents the submission of team GUCLASP to SIGMORPHON 2021 Shared Task on Generalization in Morphological Inflection Generation. We develop a multilingual model for Morphological Inflection and primarily focus on improving the model by using various training strategies to improve accuracy and generalization across languages.
DOI: https://doi.org/10.18653/v1/2021.sigmorphon-1.26
Citations: 1
SIGMORPHON 2021 Shared Task on Morphological Reinflection: Generalization Across Languages
Tiago Pimentel, Maria Ryskina, Sabrina J. Mielke, Shijie Wu, Eleanor Chodroff, Brian Leonard, Garrett Nicolai, Yustinus Ghanggo Ate, Salam Khalifa, Charbel El-Khaissi, Omer Goldman, M. Gasser, William Lane, M. Coler, Arturo Oncevay, Jaime Rafael Montoya Samame, Gema Celeste Silva Villegas, Adam Ek, Jean-Philippe Bernardy, A. Shcherbakov, Karina Sheifer, Sofya Ganieva, Matvey Plugaryov, E. Klyachko, A. Salehi, A. A. Krizhanovsky, Natalia Krizhanovsky, Clara Vania, Sardana Ivanova, A. Salchak, Christopher A. Straughn, Zoey Liu, J. North, Duygu Ataman, Witold Kieraś, Marcin Woliński, T. Suhardijanto, Niklas Stoehr, Z. Nuriah, S. Ratan, Francis M. Tyers, E. M. Ponti, Grant Aiton, R. Hatcher, Ritesh Kumar, Mans Hulden, B. Barta, Dorina Lakatos, Gábor Szolnok, Judit Ács, Mohith S Raj, David Yarowsky, Ryan Cotterell, Ben Ambridge, Ekaterina Vylomova
Abstract: This year’s iteration of the SIGMORPHON Shared Task on morphological reinflection focuses on typological diversity and cross-lingual variation of morphosyntactic features. In terms of the task, we enrich UniMorph with new data for 32 languages from 13 language families, with most of them being under-resourced: Kunwinjku, Classical Syriac, Arabic (Modern Standard, Egyptian, Gulf), Hebrew, Amharic, Aymara, Magahi, Braj, Kurdish (Central, Northern, Southern), Polish, Karelian, Livvi, Ludic, Veps, Võro, Evenki, Xibe, Tuvan, Sakha, Turkish, Indonesian, Kodi, Seneca, Asháninka, Yanesha, Chukchi, Itelmen, Eibela. We evaluate six
DOI: https://doi.org/10.18653/v1/2021.sigmorphon-1.25
Citations: 33
Simple induction of (deterministic) probabilistic finite-state automata for phonotactics by stochastic gradient descent
Huteng Dai, Richard Futrell
Abstract: We introduce a simple and highly general phonotactic learner which induces a probabilistic finite-state automaton from word-form data. We describe the learner and show how to parameterize it to induce unrestricted regular languages, as well as how to restrict it to certain subregular classes such as Strictly k-Local and Strictly k-Piecewise languages. We evaluate the learner on its ability to learn phonotactic constraints in toy examples and in datasets of Quechua and Navajo. We find that an unrestricted learner is the most accurate overall when modeling attested forms not seen in training; however, only the learner restricted to the Strictly Piecewise language class successfully captures certain nonlocal phonotactic constraints. Our learner serves as a baseline for more sophisticated methods.
DOI: https://doi.org/10.18653/v1/2021.sigmorphon-1.19
Citations: 1