Title: A Multilevel Center Embedding approach for Sentence Similarity having Complex structures
Authors: ShivKishan Dubey, Narendra Kohli
Venue: 2023 World Conference on Communication & Computing (WCONF)
Publication date: 2023-07-14
DOI: 10.1109/WCONF58270.2023.10235102
URL: https://doi.org/10.1109/WCONF58270.2023.10235102
Citation count: 0
Abstract
The volume of text data, including internet reviews, media posts, and academic articles, has increased significantly in recent years. Text similarity measurement is essential for many applications, including a range of language processing tasks and information retrieval (IR) systems. However, similarity is harder to measure for text with complicated structure, such as lengthy compound or complex sentences that contain main and subordinate clauses. In this article, we propose a multilevel center embedding method for determining similarity in such text. The method uses several embedding levels, such as the word, part-of-speech (POS), clause, and sentence levels, to capture the intricate structure of text. It constructs the center embedding of a sentence and then iteratively computes the difference between the original center embedding and modified versions of the sentence, applying the center embedding level by level so that each level introduces a new level of abstraction. Our findings show that the multilevel center embedding strategy performs better in the category of phrases and sentences with complicated structure.
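
The abstract does not define "center embedding" or give the exact procedure, so the following is only a minimal sketch of the general idea of comparing two sentences level by level. It assumes, purely for illustration, that the center embedding of a level is the centroid (mean) of that level's vectors and that per-level cosine similarities are combined by a weighted average. The function names, the toy vectors, and the weighting scheme are all hypothetical and are not taken from the paper; the iterative difference computation described in the abstract is not reproduced here.

```python
import math


def centroid(vectors):
    """Mean of equal-length vectors (ASSUMED meaning of 'center embedding')."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def cosine(a, b):
    """Cosine similarity of two vectors; 0.0 if either has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def multilevel_similarity(levels_a, levels_b, weights=None):
    """Weighted average of per-level centroid cosine similarities.

    levels_a / levels_b map a level name (e.g. "word", "clause") to a
    list of vectors for that level. Only levels present in both
    sentences are compared. The averaging is an illustrative assumption.
    """
    names = [name for name in levels_a if name in levels_b]
    if not names:
        raise ValueError("the two sentences share no embedding level")
    if weights is None:
        weights = {name: 1.0 for name in names}
    total = sum(weights[name] for name in names)
    score = sum(
        weights[name] * cosine(centroid(levels_a[name]), centroid(levels_b[name]))
        for name in names
    )
    return score / total


# Toy 2-dimensional vectors standing in for real embeddings (hypothetical data).
sentence_a = {"word": [[1.0, 0.0], [0.0, 1.0]], "clause": [[1.0, 1.0]]}
sentence_b = {"word": [[1.0, 0.0], [0.0, 1.0]], "clause": [[1.0, -1.0]]}

print(multilevel_similarity(sentence_a, sentence_b))
```

In this toy case the two sentences agree at the word level but differ at the clause level, so the combined score falls between the two per-level scores. This shows why a comparison across several levels can separate sentences that a single word-level average would treat as identical.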