A Character Identification Method using Postpositions for Animate Nouns in Korean Novels
Taekeun Park, Seung-Hoon Kim
Journal of the Korea society of IT services, published 2016-09-30
DOI: 10.9716/KITS.2016.15.3.115 (https://doi.org/10.9716/KITS.2016.15.3.115)
Citations: 2
Abstract
Submitted: June 20, 2016; 1st Revision: July 20, 2016; Accepted: July 26, 2016
* This research was carried out under the 2016 Culture Technology R&D support program of the Ministry of Culture, Sports and Tourism and the Korea Creative Content Agency.
** Professor, Department of Applied Computer Engineering, Dankook University; corresponding author
*** Professor, Department of Applied Computer Engineering, Dankook University

Novels include a wide variety of character names, depending on the genre, the spatio-temporal setting of the novel, and the nationality of the characters. Moreover, the characters and their names are products of the author's imagination. As a result, no proper noun dictionary can include every character name that authors have created or will create. In addition, because Korean has no capitalization, character names in Korean are harder to detect than those in English. Korean does, however, have postpositions, such as "-ege" and "-hante", that attach to nouns denoting sentient beings or animate objects. We call such postpositions animate postpositions in this paper. In a previous study, the authors extracted social networks from three novels translated into or written in Korean by first applying a Korean morpheme analyzer, a proper noun dictionary, postpositions (e.g., "-ga", "-eun", "-neun", "-eui", and "-ege"), and titles (e.g., "buin"), and then manually selecting character names with reference to Wikipedia and dictionaries of well-known people. However, that study does not report the precision, recall, or F-measure of character identification. In this paper, we evaluate the quantitative contribution of animate postpositions to character identification in novels, in terms of precision, recall, and F-measure. The results show that animate postpositions are a valuable tool for identifying characters, without a proper noun dictionary, in novels translated into or written in Korean.

Keywords: Information Extraction, Korean Novels, Character Identification, Postpositions for Animate Nouns, Korean Linguistic Feature

Journal of the Korea Society of IT Services, Vol. 15, No. 3, September 2016, pp. 115-125
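
The idea in the abstract can be illustrated with a minimal sketch. It is not the authors' method, which the abstract does not describe in detail: it assumes that "-ege" and "-hante" correspond to the Hangul forms 에게 and 한테, replaces the morpheme analyzer with a plain regular expression, and uses an invented two-sentence text and gold set. The function names `candidate_characters` and `precision_recall_f1` are hypothetical.

```python
import re

# Animate postpositions named in the abstract, assumed here to be
# "-ege" (에게) and "-hante" (한테).
ANIMATE_POSTPOSITIONS = ("에게", "한테")

# A Hangul stem followed by an animate postposition at the end of a word.
# Assumption: a regex stands in for a real morpheme analyzer.
_PATTERN = re.compile(
    r"([가-힣]+?)(?:" + "|".join(ANIMATE_POSTPOSITIONS) + r")(?![가-힣])"
)


def candidate_characters(text):
    """Return the stems that occur directly before an animate postposition."""
    return {match.group(1) for match in _PATTERN.finditer(text)}


def precision_recall_f1(predicted, gold):
    """Compute precision, recall, and F-measure (F1) for two sets of names."""
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Invented example text and gold character set, for illustration only.
text = "철수는 영희에게 편지를 보냈다. 영희는 민수한테 그 이야기를 했다."
gold = {"철수", "영희", "민수"}

predicted = candidate_characters(text)
precision, recall, f1 = precision_recall_f1(predicted, gold)
print(sorted(predicted), precision, recall, f1)
```

In this toy text, 철수 never appears with an animate postposition, so it is not proposed as a candidate; this is the kind of miss that recall is meant to capture.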