Compilation and evaluation of paraphrase representation list of compound verbs: Toward development of “Control language for action”

Tomoya Shirai, Kyoko Kanzaki, Hirofumi Yabumoto, H. Isahara
Published in: 2015 2nd International Conference on Advanced Informatics: Concepts, Theory and Applications (ICAICTA)
Publication date: 2015-11-23
DOI: 10.1109/ICAICTA.2015.7335351 (https://doi.org/10.1109/ICAICTA.2015.7335351)

Abstract

To realize friendly man-machine communication, machines must understand not only the surface expressions of human utterances but also the deeper meanings of human behavior. As a first step toward investigating and standardizing the lexical items that form part of a “control language for action”, we began compiling a “paraphrase representation list of compound verbs”. We processed a corpus and vectorized the data using Word2Vec. Using the resulting vectors, we computed the cosine similarity between the compound verbs and the verbs in the corpus, and created a paraphrase representation list from the results. We obtained paraphrase expressions for 1899 of the 3289 compound verbs (including orthographic variants) stored in the compound verb lexicon. This method also found words that do not exist in the Japanese WordNet. We examined the words that appear only in the automatic extraction results and found 213 unknown words and 227 new synonymous relationships. Notably, these two counts differ by 14, which means that we found 14 words that are stored in the Japanese WordNet but are not listed there as synonyms of the word in question. The proposed method is therefore useful for expanding paraphrase relationships that were compiled by human intuition.
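
The similarity step described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the toy three-dimensional vectors stand in for Word2Vec embeddings, the romanized verbs are made-up examples, and the function names, the threshold of 0.5, and the top_n cutoff are all assumptions introduced here. The classify_candidates helper reflects one possible reading of how extracted candidates might be compared against the Japanese WordNet (unknown word versus known word that is not listed as a synonym); the abstract does not spell out that procedure.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # treat a zero vector as having no similarity
    return dot / (norm_u * norm_v)


def paraphrase_candidates(compound_verb, vectors, verbs, top_n=3, threshold=0.5):
    """Rank verbs by cosine similarity to a compound verb.

    top_n and threshold are hypothetical cutoffs, not values from the paper.
    """
    target = vectors[compound_verb]
    scored = [
        (verb, cosine_similarity(target, vectors[verb]))
        for verb in verbs
        if verb != compound_verb and verb in vectors
    ]
    scored = [(verb, score) for verb, score in scored if score >= threshold]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_n]


def classify_candidates(compound_verb, candidates, wordnet_vocab, wordnet_synonyms):
    """Label each candidate relative to a WordNet-like resource (assumed reading)."""
    known_synonyms = wordnet_synonyms.get(compound_verb, set())
    labels = {}
    for verb in candidates:
        if verb not in wordnet_vocab:
            labels[verb] = "unknown_word"
        elif verb not in known_synonyms:
            labels[verb] = "known_word_new_relation"
        else:
            labels[verb] = "already_listed"
    return labels


# Toy stand-ins for Word2Vec vectors (hypothetical values).
toy_vectors = {
    "tobi-dasu": [0.9, 0.1, 0.2],  # compound verb
    "deru": [0.8, 0.2, 0.1],
    "taberu": [0.0, 1.0, 0.0],
    "hashiru": [0.5, 0.1, 0.8],
}

ranked = paraphrase_candidates(
    "tobi-dasu", toy_vectors, ["deru", "taberu", "hashiru"]
)
ranked_names = [verb for verb, _ in ranked]

labels = classify_candidates(
    "tobi-dasu",
    ranked_names,
    wordnet_vocab={"tobi-dasu", "deru"},          # toy vocabulary
    wordnet_synonyms={"tobi-dasu": set()},        # toy synonym table
)
print(ranked_names, labels)
```

With a real model, the toy dictionary would be replaced by vectors trained on the processed corpus, and the cutoffs would need to be tuned against manually judged paraphrases.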