Doc2Vec, SBERT, InferSent, and USE Which embedding technique for noun phrases?
Lahbib Ajallouda, Kawtar Najmani, A. Zellou, E. Benlahmar
2022 2nd International Conference on Innovative Research in Applied Science, Engineering and Technology (IRASET)
Published: 2022-03-03
DOI: 10.1109/IRASET52964.2022.9738300
Citations: 4
Abstract
Phrase embedding is a technique for representing phrases in a vector space. Considerable effort has gone into developing this technique to improve performance in various natural language processing (NLP) applications. Phrase embeddings have been evaluated in many studies, but most of them focus on intrinsic or extrinsic evaluation regardless of the type of phrase (noun phrases, verb phrases, etc.). In the literature, no study evaluates the embedding of noun phrases specifically, even though this type of phrase is used by many NLP applications, such as automatic key-phrase extraction (AKE), information retrieval, and question answering. In this article, we present an empirical study comparing the most common phrase embedding techniques to determine which is most suitable for representing noun phrases. The dataset used in the comparison consists of the noun phrases from the Inspec and SemEval2010 datasets, to which we have added manually defined synonyms.
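
The abstract does not state which metric the comparison uses. As a minimal sketch of one plausible setup, the Python below scores an embedding function by the mean cosine similarity between each noun phrase and its synonym. Everything named here is an assumption made for illustration: `toy_embed` is a hypothetical character-trigram stand-in for a real encoder (Doc2Vec, SBERT, InferSent, or USE would be plugged in at that point), `mean_synonym_similarity` is not the authors' procedure, and the example pairs are invented, not taken from Inspec or SemEval2010.

```python
import math
from collections import Counter
from typing import Callable, Dict, List, Tuple

Vector = Dict[str, float]  # sparse vector: feature name -> weight


def toy_embed(phrase: str) -> Vector:
    """Hypothetical stand-in embedder: counts of character trigrams.

    A real comparison would replace this with a trained encoder.
    """
    text = f"  {phrase.lower()}  "  # pad so short phrases still yield trigrams
    return dict(Counter(text[i:i + 3] for i in range(len(text) - 2)))


def cosine(u: Vector, v: Vector) -> float:
    """Cosine similarity of two sparse vectors (0.0 if either is all zeros)."""
    dot = sum(w * v.get(k, 0.0) for k, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def mean_synonym_similarity(
    embed: Callable[[str], Vector],
    pairs: List[Tuple[str, str]],
) -> float:
    """Average cosine similarity over (noun phrase, synonym) pairs."""
    if not pairs:
        raise ValueError("pairs must not be empty")
    sims = [cosine(embed(phrase), embed(synonym)) for phrase, synonym in pairs]
    return sum(sims) / len(sims)


# Invented example pairs, for illustration only.
example_pairs = [
    ("neural network", "neural net"),
    ("information retrieval", "document retrieval"),
]
score = mean_synonym_similarity(toy_embed, example_pairs)
print(f"mean synonym similarity: {score:.3f}")
```

Under this setup, a technique that places synonymous noun phrases closer together receives a higher score, so several encoders can be ranked by calling `mean_synonym_similarity` once per encoder on the same pairs.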