The Austronesian and the Micronesian Comparative Dictionaries as CLDF datasets.
Authors: Alexander D Smith, Robert Forkel, Lev Blumenfeld
Journal: Scientific Data, volume 12, issue 1, article 1015. Published 2025-06-17.
DOI: https://doi.org/10.1038/s41597-025-05301-4
PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12174365/pdf/
The Austronesian Comparative Dictionary has served as an important resource for the comparative study of Austronesian languages since Robert Blust started its compilation in 1990. Likewise, the Micronesian Comparative Dictionary, an online database of Proto-Micronesian reconstructions previously published in Oceanic Linguistics by Byron Bender and colleagues, is an important reference point for comparative linguistics. The legacy online versions of both dictionaries face an uncertain future, and neither has been available in a structured format amenable to quantitative methods. Thus, to preserve the content of both dictionaries for the scientific record and to increase interoperability of the data, we converted the dictionaries to CLDF (Cross-Linguistic Data Formats) datasets. While programmatic access to the data within each dictionary already provides a new level of usability, the true potential of data in CLDF lies in interoperability across datasets. This is particularly useful for the two dictionaries presented here: Micronesian languages belong to the Austronesian family, so the Micronesian data could potentially complement the Austronesian Comparative Dictionary. With the CLDF datasets we lay the groundwork for tackling this challenge.
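To make the interoperability point concrete, here is a minimal sketch of one way two CLDF wordlists might be aligned: by the Glottocode of each language. It assumes both datasets use the default CLDF column names (`ID`, `Name`, `Glottocode` in the LanguageTable; `ID`, `Language_ID`, `Parameter_ID`, `Form` in the FormTable) and that the relevant languages carry Glottocodes. The abstract does not describe the datasets at this level of detail, so treat both points as assumptions. Every row, ID, and Glottocode below is an invented placeholder, not data from either dictionary.

```python
import csv
import io

# Toy stand-ins for the LanguageTable and FormTable of two CLDF wordlists.
# All rows, IDs and Glottocodes are invented placeholders, not dictionary data.
FIRST_LANGUAGES = "ID,Name,Glottocode\na-1,Language A,aaaa1234\na-2,Language B,bbbb1234\n"
FIRST_FORMS = "ID,Language_ID,Parameter_ID,Form\nf1,a-1,p1,form-a1\nf2,a-2,p1,form-b1\n"
SECOND_LANGUAGES = "ID,Name,Glottocode\nm-7,Language A,aaaa1234\n"
SECOND_FORMS = "ID,Language_ID,Parameter_ID,Form\ng1,m-7,q9,form-a2\n"


def read_table(text):
    """Parse CSV text into a list of row dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))


def forms_by_glottocode(languages_csv, forms_csv):
    """Group the forms of one dataset under the Glottocode of their language."""
    glottocode_of = {row["ID"]: row["Glottocode"] for row in read_table(languages_csv)}
    grouped = {}
    for form in read_table(forms_csv):
        code = glottocode_of.get(form["Language_ID"])
        if code:  # skip forms whose language has no Glottocode
            grouped.setdefault(code, []).append(form["Form"])
    return grouped


def merge_datasets(first, second):
    """Combine forms for languages that occur in both datasets."""
    shared = sorted(set(first) & set(second))
    return {code: first[code] + second[code] for code in shared}


first = forms_by_glottocode(FIRST_LANGUAGES, FIRST_FORMS)
second = forms_by_glottocode(SECOND_LANGUAGES, SECOND_FORMS)
merged = merge_datasets(first, second)
print(merged)
```

In practice the tables would be read from each dataset's CSV files rather than from inline strings, and aligning cognate sets or reconstructions across the two dictionaries would need more than a language-level join. The sketch only shows why a shared, structured format makes such a join straightforward to attempt.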
About the journal:
Scientific Data is an open-access journal focused on data, publishing descriptions of research datasets and articles on data sharing across natural sciences, medicine, engineering, and social sciences. Its goal is to enhance the sharing and reuse of scientific data, encourage broader data sharing, and acknowledge those who share their data.
The journal primarily publishes Data Descriptors, which offer detailed descriptions of research datasets, including data collection methods and technical analyses validating data quality. These descriptors aim to facilitate data reuse rather than testing hypotheses or presenting new interpretations, methods, or in-depth analyses.