Improving term candidates selection using terminological tokens
Authors: M. Vàzquez, A. Oliver
Journal: Computational terminology and filtering of terminological information
Published: 2018-05-31
DOI: https://doi.org/10.1075/TERM.00016.VAZ
Terms identified in domain-specific corpora by computational methods have to
be validated manually by specialists, which is highly time-consuming. To
reduce this effort
and improve term candidate selection, we implemented the Token Slot Recognition
method, a filtering method based on terminological tokens which is used to rank
extracted term candidates from domain-specific corpora. This paper presents the
implementation of this term candidate filtering method within linguistic and
statistical approaches to automatic term extraction, using several
domain-specific corpora in different languages. We observed that the filtering
method improves term candidate selection: it ranks more terms at the top of the
term candidate list than raw frequency does, and for statistical term extraction
the improvement is between 15% and 25% in both precision and recall. Our
analyses further revealed a reduction in the number of
term candidates to be validated manually by specialists. In conclusion, the
Token Slot Recognition filtering method significantly reduced the number of
term candidates extracted automatically from domain-specific corpora, so
specialists can validate them easily and quickly.
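
The abstract does not describe how Token Slot Recognition scores a candidate, so the following is only a rough sketch of one possible reading, not the authors' implementation. It assumes that "terminological tokens" are the tokens of already known reference terms, recorded by their position (first, middle, last), and that a candidate is ranked by the share of its tokens that fill a matching position, with raw frequency as a tie-breaker. All function names, the scoring formula, and the toy data are hypothetical.

```python
def token_slots(reference_terms):
    """Collect tokens of known terms by position (assumed slot model)."""
    slots = {"first": set(), "middle": set(), "last": set()}
    for term in reference_terms:
        tokens = term.lower().split()
        if not tokens:
            continue
        slots["first"].add(tokens[0])
        slots["last"].add(tokens[-1])
        slots["middle"].update(tokens[1:-1])
    return slots


def slot_score(candidate, slots):
    """Share of candidate tokens found in the matching slot (hypothetical score)."""
    tokens = candidate.lower().split()
    if not tokens:
        return 0.0
    last = len(tokens) - 1
    hits = 0
    for i, tok in enumerate(tokens):
        if i == 0 and tok in slots["first"]:
            hits += 1
        elif i == last and tok in slots["last"]:
            hits += 1
        elif 0 < i < last and tok in slots["middle"]:
            hits += 1
    return hits / len(tokens)


def rank_by_slots(frequencies, slots):
    """Rank candidates by slot score, then by raw frequency, then alphabetically."""
    return sorted(
        frequencies,
        key=lambda c: (-slot_score(c, slots), -frequencies[c], c),
    )


def rank_by_frequency(frequencies):
    """Baseline: rank candidates by raw frequency only."""
    return sorted(frequencies, key=lambda c: (-frequencies[c], c))


def precision_at_n(ranked, gold_terms, n):
    """Fraction of the top-n candidates that are validated terms."""
    top = ranked[:n]
    if not top:
        return 0.0
    return sum(1 for c in top if c in gold_terms) / len(top)


# Toy data, invented for illustration only.
reference_terms = ["blood pressure", "heart rate", "blood cell count"]
candidate_freqs = {
    "this study": 12,
    "other hand": 9,
    "blood count": 5,
    "heart pressure": 3,
    "cell rate": 2,
}
gold = {"blood count", "heart pressure", "cell rate"}

slots = token_slots(reference_terms)
filtered = rank_by_slots(candidate_freqs, slots)
baseline = rank_by_frequency(candidate_freqs)
print("slot ranking:     ", filtered)
print("frequency ranking:", baseline)
print("P@2 slot / frequency:",
      precision_at_n(filtered, gold, 2), precision_at_n(baseline, gold, 2))
```

In this toy setting the frequent but non-terminological candidates drop to the bottom of the list, which is the kind of effect the abstract reports against raw frequency. The numbers here say nothing about the 15% to 25% improvement measured in the paper.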