Special Interest Group on Computational Morphology and Phonology Workshop: Latest Publications

SIGMORPHON–UniMorph 2023 Shared Task 0: Typologically Diverse Morphological Inflection
Special Interest Group on Computational Morphology and Phonology Workshop Pub Date : 1900-01-01 DOI: 10.18653/v1/2023.sigmorphon-1.13
Omer Goldman, Khuyagbaatar Batsuren, Salam Khalifa, Aryaman Arora, Garrett Nicolai, Reut Tsarfaty, Ekaterina Vylomova
Abstract: The 2023 SIGMORPHON–UniMorph shared task on typologically diverse morphological inflection included a wide range of languages: 26 languages from 9 primary language families. The data this year was all lemma-split, to allow testing models’ generalization ability, and structured along the new hierarchical schema presented in (Batsuren et al., 2022). The systems submitted this year, 9 in number, showed ingenuity and innovativeness, including hard attention for explainability and bidirectional decoding. Special treatment was also given by many participants to the newly-introduced data in Japanese, due to the high abundance of unseen Kanji characters in its test set.
Citations: 3
Using longest common subsequence and character models to predict word forms
Special Interest Group on Computational Morphology and Phonology Workshop Pub Date : 1900-01-01 DOI: 10.18653/v1/W16-2009
A. Sorokin
Abstract: This paper presents an algorithm for automatic word form inflection. We use the method of longest common subsequence to extract abstract paradigms from given pairs of basic and inflected word forms, as well as suffix and prefix features to predict this paradigm automatically. We elaborate this algorithm using a combination of affix feature-based and character n-gram models, which substantially enhances performance especially for the languages possessing nonlocal phenomena such as vowel harmony. Our system took part in SIGMORPHON 2016 Shared Task and took 3rd place in 17 of 30 subtasks and 4th place in 7 subtasks among 7 participants.
Citations: 19
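The longest-common-subsequence step mentioned in the abstract above can be illustrated with a minimal Python sketch. It is not the paper's implementation: the function names `lcs` and `mask`, the `*` placeholder, and the German example pair are assumptions made here for illustration, and the paper's actual paradigm notation may differ.

```python
def lcs(a: str, b: str) -> str:
    """Return one longest common subsequence of a and b (dynamic programming)."""
    m, n = len(a), len(b)
    # table[i][j] holds the LCS length of the suffixes a[i:] and b[j:].
    table = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m - 1, -1, -1):
        for j in range(n - 1, -1, -1):
            if a[i] == b[j]:
                table[i][j] = table[i + 1][j + 1] + 1
            else:
                table[i][j] = max(table[i + 1][j], table[i][j + 1])
    # Walk the table from the top-left corner to recover the subsequence.
    out, i, j = [], 0, 0
    while i < m and j < n:
        if a[i] == b[j]:
            out.append(a[i])
            i += 1
            j += 1
        elif table[i + 1][j] >= table[i][j + 1]:
            i += 1
        else:
            j += 1
    return "".join(out)


def mask(word: str, common: str) -> str:
    """Replace each run of characters matched against `common` with '*'.

    Hypothetical helper: matching is greedy and left-to-right, and the
    characters left over are the word-specific (affix or alternation) material.
    """
    out, k = [], 0
    for ch in word:
        if k < len(common) and ch == common[k]:
            k += 1
            if not out or out[-1] != "*":
                out.append("*")
        else:
            out.append(ch)
    return "".join(out)
```

For the pair "singen" / "gesungen", `lcs` returns "sngen", and masking both words against it gives the patterns "*i*" and "ge*u*", which together play the role of an abstract paradigm for this pair.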
The SIGMORPHON 2022 Shared Task on Cross-lingual and Low-Resource Grapheme-to-Phoneme Conversion
Special Interest Group on Computational Morphology and Phonology Workshop Pub Date : 1900-01-01 DOI: 10.18653/v1/2023.sigmorphon-1.27
Arya D. McCarthy, Jackson L. Lee, Alexandra DeLucia, Travis M. Bartley, M. Agarwal, Lucas F. E. Ashby, L. Signore, Cameron Gibson, R. Raff, Winston Wu
Abstract: Grapheme-to-phoneme conversion is an important component in many speech technologies, but until recently there were no multilingual benchmarks for this task. The third iteration of the SIGMORPHON shared task on multilingual grapheme-to-phoneme conversion features many improvements from the previous year’s task (Ashby et al., 2021), including additional languages, three subtasks varying the amount of available resources, extensive quality assurance procedures, and automated error analyses. Three teams submitted a total of fifteen systems, at best achieving relative reductions of word error rate of 14% in the cross-lingual subtask and 14% in the very-low-resource subtask. The generally consistent result is that cross-lingual transfer substantially helps grapheme-to-phoneme modeling, but not to the same degree as in-language examples.
Citations: 0
Morphological Segmentation Can Improve Syllabification
Special Interest Group on Computational Morphology and Phonology Workshop Pub Date : 1900-01-01 DOI: 10.18653/v1/W16-2016
Garrett Nicolai, Lei Yao, Grzegorz Kondrak
Abstract: Syllabification is sometimes influenced by morphological boundaries. We show that incorporating morphological information can improve the accuracy of orthographic syllabification in English and German. Surprisingly, unsupervised segmenters, such as Morfessor, can be more useful for this purpose than the supervised ones.
Citations: 6
Morphotactics as Tier-Based Strictly Local Dependencies
Special Interest Group on Computational Morphology and Phonology Workshop Pub Date : 1900-01-01 DOI: 10.18653/v1/W16-2019
Alëna Aksënova, T. Graf, S. Moradi
Abstract: It is commonly accepted that morphological dependencies are finite-state in nature. We argue that the upper bound on morphological expressivity is much lower. Drawing on technical results from computational phonology, we show that a variety of morphotactic phenomena are tier-based strictly local and do not fall into weaker subclasses such as the strictly local or strictly piecewise languages. Since the tier-based strictly local languages are learnable in the limit from positive texts, this marks a first important step towards general machine learning algorithms for morphology. Furthermore, the limitation to tier-based strictly local languages explains typological gaps that are puzzling from a purely linguistic perspective.
Citations: 28
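As a rough illustration of what "tier-based strictly local" means in the abstract above, the following Python sketch checks a morpheme sequence against a 2-local tier grammar: symbols outside the tier are ignored, and no two symbols that end up adjacent on the tier may form a forbidden pair. The function name `tsl2_accepts`, the edge markers `<` and `>`, and the toy "at most one plural marker" constraint are assumptions for illustration and are not taken from the paper.

```python
from typing import Iterable, Sequence, Set, Tuple


def tsl2_accepts(
    symbols: Sequence[str],
    tier: Iterable[str],
    forbidden: Set[Tuple[str, str]],
) -> bool:
    """Return True if `symbols` satisfies a tier-based strictly 2-local grammar.

    Step 1: project the string onto the tier (drop every non-tier symbol).
    Step 2: add edge markers and reject if any adjacent pair is forbidden.
    """
    tier_set = set(tier)
    projected = ["<"] + [s for s in symbols if s in tier_set] + [">"]
    return all(pair not in forbidden for pair in zip(projected, projected[1:]))


# Toy grammar (hypothetical): a word may carry at most one PL morpheme,
# no matter how many other morphemes stand between two candidates.
TIER = {"PL"}
FORBIDDEN = {("PL", "PL")}
```

With this toy grammar, `tsl2_accepts(["stem", "DIM", "PL"], TIER, FORBIDDEN)` is accepted, while `tsl2_accepts(["stem", "PL", "DIM", "PL"], TIER, FORBIDDEN)` is rejected even though the two PL morphemes are not adjacent in the original string. A strictly 2-local check over the full string, without the tier projection, would not catch the second case.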
Low-resource grapheme-to-phoneme mapping with phonetically-conditioned transfer
Special Interest Group on Computational Morphology and Phonology Workshop Pub Date : 1900-01-01 DOI: 10.18653/v1/2023.sigmorphon-1.29
Michael Hammond
Abstract: In this paper we explore a very simple non-neural approach to mapping orthography to phonetic transcription in a low-resource context with transfer data from a related language. We start from a baseline system and focus our efforts on data augmentation. We make three principal moves. First, we start with an HMM-based system (Novak et al., 2012). Second, we augment our basic system by recombining legal substrings in a restricted fashion (Ryan and Hulden, 2020). Finally, we limit our transfer data by only using training pairs where the phonetic form shares all bigrams with the target language.
Citations: 1
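The transfer-data filter described at the end of the abstract above can be sketched in Python as follows. This is only one reading of the sentence, under stated assumptions: phonetic forms are represented as lists of phone symbols, bigrams are counted without word-boundary markers, and the names `Pair`, `bigrams`, and `filter_transfer` as well as the toy data are hypothetical.

```python
from typing import List, Set, Tuple

# (orthographic form, phone sequence); this representation is an assumption.
Pair = Tuple[str, List[str]]


def bigrams(phones: List[str]) -> Set[Tuple[str, str]]:
    """Return the set of adjacent phone pairs in a phonetic form."""
    return set(zip(phones, phones[1:]))


def filter_transfer(target: List[Pair], transfer: List[Pair]) -> List[Pair]:
    """Keep transfer pairs whose phone bigrams are all attested in the target data."""
    attested: Set[Tuple[str, str]] = set()
    for _, phones in target:
        attested |= bigrams(phones)
    # A form with fewer than two phones has no bigrams and is therefore kept.
    return [(orth, phones) for orth, phones in transfer if bigrams(phones) <= attested]


# Toy data, invented for illustration only.
target_pairs = [("kata", ["k", "a", "t", "a"]), ("taka", ["t", "a", "k", "a"])]
transfer_pairs = [("tak", ["t", "a", "k"]), ("kasa", ["k", "a", "s", "a"])]
kept = filter_transfer(target_pairs, transfer_pairs)
```

Here `kept` contains only the "tak" pair: its bigrams (t, a) and (a, k) both occur in the target data, whereas "kasa" introduces the unattested bigrams (a, s) and (s, a) and is dropped.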
Findings of the SIGMORPHON 2023 Shared Task on Interlinear Glossing
Special Interest Group on Computational Morphology and Phonology Workshop Pub Date : 1900-01-01 DOI: 10.18653/v1/2023.sigmorphon-1.20
Michael Ginn, Sarah Moeller, Alexis Palmer, Anna Stacey, Garrett Nicolai, Mans Hulden, Miikka Silfverberg
Abstract: This paper presents the findings of the SIGMORPHON 2023 Shared Task on Interlinear Glossing. This first iteration of the shared task explores glossing of a set of six typologically diverse languages: Arapaho, Gitksan, Lezgi, Natügu, Tsez and Uspanteko. The shared task encompasses two tracks: a resource-scarce closed track and an open track, where participants are allowed to utilize external data resources. Five teams participated in the shared task. The winning team Tü-CL achieved a 23.99%-point improvement over a baseline RoBERTa system in the closed track and a 17.42%-point improvement in the open track.
Citations: 4
An Ensembled Encoder-Decoder System for Interlinear Glossed Text
Special Interest Group on Computational Morphology and Phonology Workshop Pub Date : 1900-01-01 DOI: 10.18653/v1/2023.sigmorphon-1.23
Edith Coates
Abstract: This paper presents my submission to Track 1 of the 2023 SIGMORPHON shared task on interlinear glossed text (IGT). There is a wide range of techniques for building and training IGT models (see Moeller and Hulden, 2018; McMillan-Major, 2020; Zhao et al., 2020). I describe my ensembled sequence-to-sequence approach, perform experiments, and share my submission’s test-set accuracy. I also discuss future areas of research in low-resource token classification methods for IGT.
Citations: 1
SIGMORPHON–UniMorph 2023 Shared Task 0, Part 2: Cognitively Plausible Morphophonological Generalization in Korean
Special Interest Group on Computational Morphology and Phonology Workshop Pub Date : 1900-01-01 DOI: 10.18653/v1/2023.sigmorphon-1.14
Canaan Breiss, Jinyoung Jo
Abstract: This paper summarises data collection and curation for Part 2 of the 2023 SIGMORPHON-UniMorph Shared Task 0, which focused on modeling speaker knowledge and generalization of a pair of interacting phonological processes in Korean. We briefly describe how modeling the generalization task could be of interest to researchers in both Natural Language Processing and linguistics, and then summarise the traditional description of the phonological processes that are at the center of the modeling challenge. We then describe the criteria we used to select and code cases of process application in two Korean speech corpora, which served as the primary learning data. We also report the technical details of the experiment we carried out that served as the primary test data.
Citations: 0
Linear Discriminative Learning: a competitive non-neural baseline for morphological inflection
Special Interest Group on Computational Morphology and Phonology Workshop Pub Date : 1900-01-01 DOI: 10.18653/v1/2023.sigmorphon-1.16
Cheon-Yeong Jeong, Dominic Schmitz, Akhilesh Kakolu Ramarao, Anna Stein, Kevin Tang
Abstract: This paper presents our submission to the SIGMORPHON 2023 task 2 of Cognitively Plausible Morphophonological Generalization in Korean. We implemented both Linear Discriminative Learning and Transformer models and found that the Linear Discriminative Learning model trained on a combination of corpus and experimental data showed the best performance with an overall accuracy of around 83%. We found that the best model must be trained on both corpus data and the experimental data of one particular participant. Our examination of speaker-variability and speaker-specific information did not explain why a particular participant combined well with the corpus data. We recommend Linear Discriminative Learning models as a future non-neural baseline system, owing to its training speed, accuracy, model interpretability and cognitive plausibility. In order to improve the model performance, we suggest using bigger data and/or performing data augmentation and incorporating speaker- and item-specifics considerably.
Citations: 1