Learning Sense Embeddings from Dictionary Definition

Kai-Wen Tuan, Kuan-lin Lee, Jason J. S. Chang
DOI: 10.1145/3582580.3582603 (https://doi.org/10.1145/3582580.3582603)
Published in: Proceedings of the 2022 5th International Conference on Education Technology Management
Publication date: 2022-12-16
Citations: 0

Abstract

We introduce a method for learning to embed word senses as defined in a given set of dictionaries. In our approach, senses, represented as definition triples, are transformed into low-dimensional vectors trained to maximize the probability of reconstructing the definitions in an autoencoder. The method involves automatically training a sense autoencoder to encode sense definitions, automatically aligning sense definitions, and automatically generating embeddings for arbitrary descriptions. At run time, user queries are mapped into the embedding space and the retrieved sense definitions are re-ranked. We present a prototype sense definition embedding, SenseNet, that applies the method to two dictionaries. Blind evaluation on a set of real queries shows that the method significantly outperforms a baseline based on the Lesk algorithm. Our methodology supports combining multiple dictionaries, which yields additional improvement in representing their sense definitions.
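The run-time step described in the abstract (map a query into the embedding space, then rank sense definitions against it) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the `SENSES` data, the function names, and the use of cosine similarity are hypothetical, and `embed` is a bag-of-words placeholder rather than the trained sense autoencoder the paper describes.

```python
import math
from collections import Counter

# Hypothetical toy data: (headword, definition) pairs standing in for
# dictionary senses. Not taken from the paper.
SENSES = [
    ("bank", "a financial institution that accepts deposits and lends money"),
    ("bank", "the sloping land beside a river or lake"),
    ("bat", "a small flying mammal that is active at night"),
]

def embed(text):
    """Placeholder encoder: a sparse bag-of-words count vector.

    The paper trains an autoencoder for this step; a word-count vector
    is used here only so the sketch runs without a trained model.
    """
    return Counter(text.lower().split())

def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[w] * v[w] for w in u if w in v)
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)

def rank_senses(query, senses):
    """Return (score, headword, definition) tuples, best match first."""
    q = embed(query)
    scored = [
        (cosine(q, embed(definition)), headword, definition)
        for headword, definition in senses
    ]
    return sorted(scored, key=lambda item: item[0], reverse=True)

if __name__ == "__main__":
    for score, headword, definition in rank_senses("land beside a river", SENSES):
        print(f"{score:.3f}  {headword}: {definition}")
```

Swapping `embed` for a learned encoder would leave `rank_senses` unchanged, which is the main reason the sketch keeps encoding and ranking as separate functions.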