{"title":"A novel hybrid approach for text encoding: Cognitive Attention To Syntax model to detect online misinformation","authors":"Géraud Faye , Wassila Ouerdane , Guillaume Gadek , Souhir Gahbiche , Sylvain Gatepaille","doi":"10.1016/j.datak.2023.102230","DOIUrl":null,"url":null,"abstract":"<div><p>Most approaches for text encoding rely on the attention mechanism, at the core of the transformers architecture and large language models. The understanding of this mechanism is still limited and present inconvenients such as lack of interpretability, large requirements of data and low generalization. Based on current understanding of the attention mechanism, we propose CATS (Cognitive Attention To Syntax), a neurosymbolic attention encoding approach based on the syntactic understanding of texts. This approach has on-par to better performance compared to classical attention and displays expected advantages of neurosymbolic AI such as better functioning with little data and better explainability. This layer has been tested on the task of misinformation detection but is general and could be used in any task involving natural language processing.</p></div>","PeriodicalId":55184,"journal":{"name":"Data & Knowledge Engineering","volume":"148 ","pages":"Article 102230"},"PeriodicalIF":2.7000,"publicationDate":"2023-09-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Data & Knowledge Engineering","FirstCategoryId":"94","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S0169023X23000903","RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q3","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
Abstract
Most approaches for text encoding rely on the attention mechanism, which lies at the core of the transformer architecture and of large language models. The understanding of this mechanism is still limited, and it presents drawbacks such as a lack of interpretability, large data requirements, and poor generalization. Based on the current understanding of the attention mechanism, we propose CATS (Cognitive Attention To Syntax), a neurosymbolic attention encoding approach based on the syntactic understanding of texts. This approach performs on par with or better than classical attention and displays the expected advantages of neurosymbolic AI, such as better performance with little data and better explainability. The layer has been tested on the task of misinformation detection, but it is general and could be used in any natural language processing task.
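To make the contrast concrete, below is a minimal sketch (not the paper's implementation) of the two ideas the abstract opposes: classical scaled dot-product attention, whose weights come from learned query/key projections, versus an attention matrix derived from a dependency parse, in the spirit of a neurosymbolic approach like CATS. The specific weighting scheme (each token attends uniformly to itself, its governor, and its dependents) and the hand-coded dependency heads are assumptions for illustration only; a real system would obtain the parse from a syntactic parser.

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Classical attention: pairwise weights emerge from learned Q/K projections."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)
    # Row-wise softmax over the scores.
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V

def syntax_attention(heads, V):
    """Hypothetical syntax-based attention: token i attends to itself, its
    dependency head, and its dependents, with uniform weights.
    `heads[i]` is the index of token i's governor (the root points to itself)."""
    n = len(heads)
    mask = np.eye(n)                    # every token attends to itself
    for i, h in enumerate(heads):
        mask[i, h] = 1.0                # dependent attends to its governor
        mask[h, i] = 1.0                # governor attends to its dependent
    weights = mask / mask.sum(axis=-1, keepdims=True)
    return weights @ V

# Toy sentence "cats chase mice": token 1 ("chase") is the root,
# tokens 0 and 2 depend on it.
heads = [1, 1, 1]
rng = np.random.default_rng(0)
V = rng.normal(size=(3, 4))             # toy token embeddings
Q = K = rng.normal(size=(3, 4))
print(scaled_dot_product_attention(Q, K, V))
print(syntax_attention(heads, V))
```

The syntax-derived matrix needs no training data to produce sensible token-to-token weights, which illustrates why such an approach could function better in low-data regimes and be easier to inspect than learned attention maps.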
About the journal
Data & Knowledge Engineering (DKE) stimulates the exchange of ideas and interaction between these two related fields of interest. DKE reaches a world-wide audience of researchers, designers, managers and users. The major aim of the journal is to identify, investigate and analyze the underlying principles in the design and effective use of these systems.