Auto-POS templates and mixed metrics for recognizing terms in scientific literature

Hongliang You, Wei Zhang, Junyi Shen, Yang Yu, Ting Liu
{"title":"Auto-POS templates and mixed metrics for recognizing terms in scientific literature","authors":"Hongliang You, Wei Zhang, Junyi Shen, Yang Yu, Ting Liu","doi":"10.1109/KAM.2010.5646319","DOIUrl":null,"url":null,"abstract":"Automatic Term Recognition (ATR) is an important task for Knowledge Acquisition, which aims at acquiring formalized words which are not recorded in time in the glossary. In recent years, several statistical methods has proved to be effective, and emerging methods such as C-value, NC-Value, TermExtractor has shown great advantages on this task. However, few works have been done on the Metric mixing algorithm that combines those metrics as a whole. In this paper, we first collect part-of-speech templates from already-known terms automatically, namely Auto-POS templates, instead of artificial regular expressions, and then we match them with POS strings to acquire candidate terms. Finally we sort those candidates by metric mixing algorithm. Experimental results on IEEE2006-2007 metadata show that the metric mixing algorithm performs better than any separate metrics alone.","PeriodicalId":160788,"journal":{"name":"2010 Third International Symposium on Knowledge Acquisition and Modeling","volume":"243 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2010-11-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2010 Third International Symposium on Knowledge Acquisition and Modeling","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/KAM.2010.5646319","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0

Abstract

Automatic Term Recognition (ATR) is an important task for Knowledge Acquisition, which aims at acquiring formalized words which are not recorded in time in the glossary. In recent years, several statistical methods has proved to be effective, and emerging methods such as C-value, NC-Value, TermExtractor has shown great advantages on this task. However, few works have been done on the Metric mixing algorithm that combines those metrics as a whole. In this paper, we first collect part-of-speech templates from already-known terms automatically, namely Auto-POS templates, instead of artificial regular expressions, and then we match them with POS strings to acquire candidate terms. Finally we sort those candidates by metric mixing algorithm. Experimental results on IEEE2006-2007 metadata show that the metric mixing algorithm performs better than any separate metrics alone.
用于识别科学文献中的术语的自动pos模板和混合度量
术语自动识别(ATR)是知识获取的一项重要任务,其目的是获取未及时记录在词汇表中的形式化词汇。近年来,已有几种统计方法被证明是有效的,C-value、NC-Value、TermExtractor等新兴方法在这一任务上显示出很大的优势。然而,将这些度量作为一个整体进行度量混合算法的研究工作却很少。本文首先从已知词中自动收集词性模板,即Auto-POS模板,代替人工正则表达式,然后与词性字符串进行匹配,获得候选词。最后用度量混合算法对候选对象进行排序。在IEEE2006-2007元数据上的实验结果表明,度量混合算法的性能优于单独使用任何单独的度量。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 求助全文
来源期刊
自引率
0.00%
发文量
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信