Generating weighted vector for concepts in Indonesian translation of Quran

S. Putra, K. Hulliyah, Nashrul Hakiem, R. Iswara, A. Firmansyah
Published in: Proceedings of the 18th International Conference on Information Integration and Web-based Applications and Services
Publication date: 2016-11-28
DOI: 10.1145/3011141.3011218 (https://doi.org/10.1145/3011141.3011218)
Citations: 12

Abstract

This paper presents work on generating a weighted vector for each concept in the Indonesian Translation of the Quran (ITQ). The aim is to provide a resource needed to implement a semantic-based question answering system (QAS) for ITQ, particularly for retrieving semantically related verses. A semantic approach to QAS employs the ontology concepts of the domain. Since no ontology existed for ITQ, we built one by utilizing the existing ontology from the Quranic Arabic Corpus (http://corpus.quran.com/). Each leaf concept, enriched with related Quran verses as its instances, was given a representation vector of the terms that occur in the corresponding verses, expressing how strongly the concept relates to the verse terms. The vector is weighted by applying the TF-IDF method. Of the 222 leaf concepts in the ontology, we applied the process only to those categorized as Person, Location, or Time named entities, 107 in total. The results show that the concept most strongly associated with its verse terms is syaitan, with a score of 0.895 out of 1. Overall, 16.82% of the concepts scored above 0.4, followed by 14.95% above 0.3 and 23.36% above 0.2, while 11.21% scored below 0.1. The largest group, 33.64% of the concepts, scored between 0.1 and 0.2.
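The abstract does not say which TF-IDF variant the authors used, so the following is only a minimal sketch of the general idea under stated assumptions: the verses attached to each concept are pooled into one pseudo-document, tokenization is plain lowercase whitespace splitting, and the weight is relative term frequency times log(N/df). The function name `concept_vectors` and the toy strings are hypothetical placeholders (made-up word sequences, not quotations from ITQ), and the sketch does not attempt to reproduce the reported 0-to-1 scores.

```python
import math
from collections import Counter


def concept_vectors(concept_verses):
    """Map each concept to a {term: tf-idf weight} dictionary.

    concept_verses: dict of concept name -> list of verse strings.
    Assumption: all verses of a concept form one pseudo-document.
    """
    # Term counts per concept, using naive lowercase whitespace tokens.
    counts = {
        concept: Counter(" ".join(verses).lower().split())
        for concept, verses in concept_verses.items()
    }
    n_docs = len(counts)
    # Document frequency: number of concept documents containing each term.
    df = Counter(term for c in counts.values() for term in c)

    vectors = {}
    for concept, c in counts.items():
        total = sum(c.values())
        vectors[concept] = {
            term: (freq / total) * math.log(n_docs / df[term])
            for term, freq in c.items()
        }
    return vectors


# Hypothetical toy input: invented word strings, not actual verse text.
toy = {
    "syaitan": ["syaitan musuh dan nyata", "langkah syaitan"],
    "musa": ["musa dan kaumnya"],
    "mekah": ["mekah dan kota"],
}

vectors = concept_vectors(toy)
for concept, vec in vectors.items():
    top_term = max(vec, key=vec.get)
    print(concept, top_term, round(vec[top_term], 3))
```

In this sketch a term that occurs in every concept document (here "dan") receives weight 0, while a term concentrated in a single concept receives the highest weight. A real pipeline would likely add Indonesian stop-word removal and stemming, which the abstract does not describe.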