Working with a Linguistic Corpus Using R: An Introductory Note with Indonesian Negating Construction

Gede Primahadi Wijaya Rajeg, Karlina Denistia, I. M. Rajeg
Journal: Linguistik Indonesia
Publication type: Journal Article
Published: 2019-02-20
DOI: 10.26499/LI.V36I1.71 (https://doi.org/10.26499/LI.V36I1.71)
Citations: 1

Abstract

This paper demonstrates the use of R as a unified data-science environment for corpus linguistics through a series of corpus-based analyses of the Indonesian negating construction. The data come from approximately 17 million word tokens of an online-news corpus that is part of the Indonesian Leipzig Corpora. First, we find that tidak is the most frequent form in our corpus. Next, we find that tak has a significantly higher type frequency than tidak for negated predicates with the [ter-X-kan] schema; this finding adds quantitative nuance to a description in an Indonesian reference grammar, which states that (i) in present-day Indonesian tidak is also commonly used to negate ter- predicates, while (ii) the obligatory use of tak to negate ter- predicates is a past usage. Lastly, we refine the second finding by applying Distinctive Collexeme Analysis, which shows that, compared with tidak, tak strongly attracts specific verbs, predominantly in the [ter-X-kan] schema; this finding offers a deeper characterisation of tidak and tak.
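
The three steps named in the abstract (token frequency, type frequency, and distinctive collexeme strength) can be sketched as follows. This is a minimal illustration in Python, not the authors' R code: the sentences and the counts are invented, the regular expression is only one possible way to approximate the [ter-X-kan] schema, and the helper `fisher_greater` is a hypothetical name. Distinctive Collexeme Analysis is commonly computed with a Fisher exact test on a 2x2 table; the abstract does not say which association measure the paper uses, so that choice is an assumption here.

```python
import re
from collections import Counter
from math import comb, log10

# Toy sentences invented for illustration; NOT drawn from the Leipzig Corpora.
sentences = [
    "Masalah itu tak terselesaikan sampai sekarang.",
    "Kerugian itu tak terhindarkan lagi.",
    "Keindahan itu tak terlukiskan dengan kata-kata.",
    "Masalah itu tidak terselesaikan tahun ini.",
    "Dia tidak datang ke kantor.",
    "Kesalahan itu tak terhindarkan.",
]

# A negator immediately followed by a word of the form ter...kan.
pattern = re.compile(r"\b(tidak|tak)\s+(ter\w+kan)\b", re.IGNORECASE)

# Collect (negator, predicate) pairs across all sentences.
pairs = [
    (m.group(1).lower(), m.group(2).lower())
    for s in sentences
    for m in pattern.finditer(s)
]

# Token frequency: how often each negator occurs in the schema.
token_freq = Counter(neg for neg, _ in pairs)

# Type frequency: how many distinct predicates each negator combines with.
type_freq = {neg: len({v for n, v in pairs if n == neg}) for neg in token_freq}


def fisher_greater(a, b, c, d):
    """One-sided Fisher exact p-value, P(X >= a), for the 2x2 table
    [[a, b], [c, d]] under the hypergeometric null hypothesis."""
    row1, col1, n = a + b, a + c, a + b + c + d
    upper = min(row1, col1)
    numerator = sum(
        comb(row1, k) * comb(n - row1, col1 - k) for k in range(a, upper + 1)
    )
    return numerator / comb(n, col1)


# Invented counts for one verb: rows = this verb / all other verbs,
# columns = occurrences after tak / after tidak.
p_value = fisher_greater(12, 2, 28, 58)
strength = -log10(p_value)  # a frequently reported "collostruction strength"

print(token_freq, type_freq)
print(p_value, strength)
```

With real data, the 2x2 table would be filled once per predicate from the corpus counts, and predicates would then be ranked by their strength of attraction to tak or to tidak.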