Quantitative and Semantic Analysis of Texts in Turkic Languages using Universal Declaration of Human Rights (UDHR) as a Corpus

A. Adamov, Gozel Khasanova
Published in: 2022 IEEE 16th International Conference on Application of Information and Communication Technologies (AICT)
Publication date: 2022-10-12
DOI: https://doi.org/10.1109/AICT55583.2022.10013645

Abstract

Thanks to the Web, ubiquitous digital technologies, and people's increasing use of digital environments for work, entertainment, education, and other activities, huge amounts of textual data are generated and made available online. Text is the most informative and, at the same time, the most complex data type for machines to comprehend. Text Analytics is a field that draws on a number of computer science disciplines to process textual data and transform it into a machine-readable format suitable for another field of study, Natural Language Processing, which extracts meaning. This research paper attempts to apply a broad variety of statistical analysis methods to corpora of several Turkic languages, using the Universal Declaration of Human Rights as the corpus. Quantitative Text Analysis, as a research area, focuses on understanding human language through statistics and numbers. Because language is the most effective tool for describing the social world, Quantitative Text Analysis enables social exploration of the real world at scale.
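
The abstract does not list the specific statistics the authors compute. As a minimal sketch of what basic quantitative text analysis can look like, the Python example below counts tokens and distinct word types, and derives a type-token ratio and mean word length. The function names (tokenize, text_stats) and the sample string are hypothetical and for illustration only; the sample is a made-up Turkish phrase, not a quotation from the UDHR corpus, and the lowercasing step is a simplification that does not handle the Turkish dotted and dotless I correctly.

```python
import re
import unicodedata
from collections import Counter


def tokenize(text):
    """Split text into lowercase word tokens (letters only, Unicode-aware)."""
    # Normalize to NFC so letters with diacritics are single code points.
    text = unicodedata.normalize("NFC", text)
    # Simplification: str.lower() is locale-independent and mishandles
    # Turkish "I"/"İ"; a real pipeline would need language-specific casing.
    return re.findall(r"[^\W\d_]+", text.lower())


def text_stats(text):
    """Return basic quantitative measures for one text."""
    tokens = tokenize(text)
    counts = Counter(tokens)
    n_tokens = len(tokens)
    n_types = len(counts)
    return {
        "tokens": n_tokens,
        "types": n_types,
        "type_token_ratio": n_types / n_tokens if n_tokens else 0.0,
        "mean_word_length": (
            sum(len(t) for t in tokens) / n_tokens if n_tokens else 0.0
        ),
        "top_words": counts.most_common(3),
    }


# Hypothetical sample text (not taken from the UDHR).
sample = "kitap okumak güzel kitap almak güzel"
stats = text_stats(sample)
print(stats)
```

Running text_stats on the translation of the same document in each language would give per-language figures that can be compared side by side, which is one reason a parallel text such as the UDHR is convenient as a corpus.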