A study on the common words found in different literary Romanian corpora

2014 10th International Conference on Communications (COMM), Pub Date: 2014-07-28, DOI: 10.1109/ICCOMM.2014.6866729

A. Mitrea, A. Vlad, Octavian Hodea, R. Dragomir


Citations: 3

Abstract

The experimental analysis focused on a corpus of literary works - novels and short stories (107 books) - comprising about 12.7 million words, obtained by bringing together two different collections of writings of similar length. The main objective was to identify the common words occurring in each of the books constituting the corpus as a whole, in each of the sub-corpora organized per author and also in each sub-collection of texts.
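One way to read "common words occurring in each of the books" is as the intersection of the books' vocabularies, computed once for the whole corpus and once per author. The sketch below illustrates that reading only; it is not the authors' procedure. The tokenization rule (lowercased runs of letters), the function names, and the toy sentences and author labels are all assumptions made for illustration.

```python
import re


def vocabulary(text):
    """Return the set of lowercased word forms in a text (letters only)."""
    return set(re.findall(r"[^\W\d_]+", text.lower()))


def common_words(texts):
    """Return the words that occur in every text of the collection."""
    vocabularies = [vocabulary(t) for t in texts]
    if not vocabularies:
        return set()
    return set.intersection(*vocabularies)


# Toy stand-ins for books, grouped by hypothetical author labels.
books_by_author = {
    "author_a": ["casa este mare și frumoasă", "el este în casa de la munte"],
    "author_b": ["ea este la munte", "și el este acolo"],
}

# Words shared by all books of one author (one sub-corpus per author).
per_author = {a: common_words(ts) for a, ts in books_by_author.items()}

# Words shared by every book in the corpus as a whole.
all_books = [t for ts in books_by_author.values() for t in ts]
corpus_common = common_words(all_books)
```

The same `common_words` helper could be applied to each sub-collection of texts by passing that sub-collection's books instead of an author's.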


Source journal

2014 10th International Conference on Communications (COMM)

Self-citation rate

0.00%

Number of published articles