Blowing in the wind: Using ‘North Wind and the Sun’ texts to sample phoneme inventories

Impact factor: 0.8 · Zone 3 (Literature) · LANGUAGE & LINGUISTICS

Journal of the International Phonetic Association · Pub Date: 2021-06-07 · DOI: 10.1017/S002510032000033X

Louise Baird, Nick Evans, Simon J. Greenhill

Volume 52, pages 453-494

Citations: 5

Abstract



Language documentation faces a persistent and pervasive problem: How much material is enough to represent a language fully? How much text would we need to sample the full phoneme inventory of a language? In the phonetic/phonemic domain, what proportion of the phoneme inventory can we expect to sample in a text of a given length? Answering these questions in a quantifiable way is tricky, but asking them is necessary. The cumulative collection of Illustrative Texts published in the Illustration series in this journal over more than four decades (mostly renditions of the ‘North Wind and the Sun’) gives us an ideal dataset for pursuing these questions. Here we investigate a tractable subset of the above questions, namely: What proportion of a language’s phoneme inventory do these texts enable us to recover, in the minimal sense of having at least one allophone of each phoneme? We find that, even with this low bar, only three languages (Modern Greek, Shipibo and the Treger dialect of Breton) attest all phonemes in these texts. Unsurprisingly, these languages sit at the low end of phoneme inventory sizes (respectively 23, 24 and 36 phonemes). We then estimate the rate at which phonemes are sampled in the Illustrative Texts and extrapolate to see how much text it might take to display a language’s full inventory. Finally, we discuss the implications of these findings for linguistics in its quest to represent the world’s phonetic diversity, and for JIPA in its design requirements for Illustrations and in particular whether supplementary panphonic texts should be included.
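The coverage question posed in the abstract can be illustrated with a minimal Python sketch. It is not the authors' method or code: the function names `coverage` and `recovery_curve` are hypothetical, the inventory and transcript are invented toy data, and the transcript is assumed to be already phonemicized, so that any allophone counts as an attestation of its phoneme.

```python
from typing import Iterable, List, Set


def coverage(inventory: Set[str], transcript: Iterable[str]) -> float:
    """Fraction of the inventory attested at least once in the transcript."""
    if not inventory:
        raise ValueError("inventory must not be empty")
    seen = inventory.intersection(transcript)
    return len(seen) / len(inventory)


def recovery_curve(inventory: Set[str], transcript: List[str]) -> List[int]:
    """Number of distinct inventory phonemes seen after each token."""
    seen: Set[str] = set()
    curve: List[int] = []
    for token in transcript:
        if token in inventory:
            seen.add(token)
        curve.append(len(seen))
    return curve


# Toy data, invented for illustration only (not from the paper).
toy_inventory = {"p", "t", "k", "m", "n", "a", "i", "u"}
toy_transcript = ["t", "a", "m", "a", "t", "i", "n", "a", "k", "a"]

share = coverage(toy_inventory, toy_transcript)
curve = recovery_curve(toy_inventory, toy_transcript)
missing = sorted(toy_inventory - set(toy_transcript))

print(share)    # 0.75
print(curve)    # [1, 2, 3, 3, 3, 4, 5, 5, 6, 6]
print(missing)  # ['p', 'u']
```

In this toy case the curve flattens while two phonemes remain unattested, which is the kind of pattern that motivates extrapolating how much more text would be needed to reach the full inventory.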


Source journal

Journal of the International Phonetic Association Multiple-

CiteScore: 2.10

Self-citation rate: 12.50%

About the journal: The Journal of the International Phonetic Association (JIPA) is a forum for work in the fields of phonetic theory and description. As well as including papers on laboratory phonetics/phonology and related topics, the journal encourages submissions on practical applications of phonetics to areas such as phonetics teaching and speech therapy, as well as the analysis of speech phenomena in relation to computer speech processing. It is especially concerned with the theory behind the International Phonetic Alphabet and discussions of the use of symbols for illustrating the phonetic structures of a wide variety of languages. JIPA now publishes online audio files to supplement written articles. Published for the International Phonetic Association.