A Character Identification Method using Postpositions for Animate Nouns in Korean Novels

Taekeun Park, Seung-Hoon Kim
Journal of the Korea society of IT services. Published 2016-09-30. DOI: 10.9716/KITS.2016.15.3.115 (https://doi.org/10.9716/KITS.2016.15.3.115). Citation count: 2.

Abstract

Submitted: June 20, 2016. 1st Revision: July 20, 2016. Accepted: July 26, 2016.
* This research was supported by the 2016 Culture Technology R&D Program of the Ministry of Culture, Sports and Tourism and the Korea Creative Content Agency.
** Professor, Department of Applied Computer Engineering, Dankook University; corresponding author.
*** Professor, Department of Applied Computer Engineering, Dankook University.

Novels include a wide variety of character names, depending on the genre, the spatio-temporal background of the novel, and the nationality of the characters. Moreover, the characters and their names are products of the author's pen and imagination. As a result, no proper noun dictionary can include every character name that has been or will be created by authors. In addition, since Korean has no capitalization, character names in Korean are harder to detect than those in English. Fortunately, Korean has postpositions, such as "-ege" and "-hante", that attach to nouns denoting a sentient being or an animate object. We call such postpositions animate postpositions in this paper. In a previous study, the authors manually selected character names by referencing both Wikipedia and dictionaries of well-known people, after applying a Korean morpheme analyzer, a proper noun dictionary, postpositions (e.g., "-ga", "-eun", "-neun", "-eui", and "-ege"), and titles (e.g., "buin"), in order to extract social networks from three novels translated into or written in Korean. However, that study does not report the precision, recall, or F-measure of character identification. In this paper, we evaluate the quantitative contribution of animate postpositions to character identification in novels, in terms of precision, recall, and F-measure. The results show that animate postpositions are a valuable and powerful tool for identifying characters, without a proper noun dictionary, in novels translated into or written in Korean.

Keywords: Information Extraction, Korean Novels, Character Identification, Postpositions for Animate Nouns, Korean Linguistic Feature

Journal of the Korea Society of IT Services, Vol. 15, No. 3, September 2016, pp. 115-125
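
The idea of using animate postpositions and scoring the result with precision, recall, and F-measure can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes plain suffix matching on whitespace-separated tokens in place of a morpheme analyzer, it uses only the two postpositions named in the abstract, and the function names, sample sentences, and gold character set are hypothetical.

```python
# Hypothetical sketch: suffix matching on whitespace-separated tokens stands in
# for the morpheme analysis that a real system would rely on.
ANIMATE_POSTPOSITIONS = ("에게", "한테")  # "-ege" and "-hante"
PUNCTUATION = ".,!?\"'"


def candidate_characters(text):
    """Collect the stems that directly precede an animate postposition."""
    candidates = set()
    for token in text.split():
        token = token.strip(PUNCTUATION)
        for postposition in ANIMATE_POSTPOSITIONS:
            # Require a non-empty stem in front of the postposition.
            if token.endswith(postposition) and len(token) > len(postposition):
                candidates.add(token[: -len(postposition)])
                break
    return candidates


def precision_recall_f(predicted, gold):
    """Compute precision, recall, and F-measure for two sets of names."""
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    total = precision + recall
    f_measure = 2 * precision * recall / total if total else 0.0
    return precision, recall, f_measure


# Made-up example text and gold annotation, for illustration only.
sample_text = "철수가 영희에게 편지를 썼다. 영희는 민수한테 책을 주었다."
gold_characters = {"철수", "영희", "민수"}

predicted_characters = candidate_characters(sample_text)
precision, recall, f_measure = precision_recall_f(predicted_characters, gold_characters)
```

In this toy example the two names followed by "-ege" or "-hante" are found, while the character that appears only with "-ga" is missed, so precision is high and recall is lower. A suffix match of this kind would also pick up pronouns and common animate nouns, which a fuller system would need to filter.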