A Neuro Symbolic Approach for Contradiction Detection in Persian Text

Zeinab Rahimi, M. Shamsfard
Journal: J. Univers. Comput. Sci., 25(1), pp. 242-264. Published 2023-03-28. Journal Article. DOI: 10.3897/jucs.90646 (https://doi.org/10.3897/jucs.90646)

Abstract

Detecting semantically contradictory sentences is a challenging and fundamental problem for NLP applications such as textual entailment recognition. In this study, contradiction covers different types of semantic opposition, such as negation, antonymy, and numerical mismatch. Because Persian and other low-resource languages lack sufficient data for accurate machine learning, and for deep learning in particular, rule-based approaches are of great interest. At the same time, newer methods such as transfer learning have made deep learning feasible for low-resource languages. This paper introduces a hybrid approach for detecting seven categories of contradiction in Persian text: antonymy, negation, numerical, factive, structural, lexical, and world knowledge. The proposed method consists of 1) a novel data mining method and 2) a transformer-based deep neural method for contradiction detection. A simple baseline is also presented for comparison. The data mining method uses frequent rule mining on a development set to extract contradiction detection rules, and the extracted rules are tested on the different categories of contradictory sentences. In the first step, a classifier checks whether the rules are applicable to an input sentence pair. Depending on the result, the rules are then applied to three categories: negation, numerical, and antonymy. Among these, the negation category reaches the highest F-measure (90%), and the average F-measure over the three categories is 86%. For the other four categories, where the rules reach a lower F-measure of 62%, the transformer-based method achieves 76%. The proposed hybrid approach has an overall F-measure above 80%.
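The routing described in the abstract (a gate decides whether rules apply, rules handle negation, numerical, and antonymy, and the remaining pairs go to a neural model) can be sketched roughly as follows. This is only an illustration under stated assumptions: the rules here are hand-written toy rules over English tokens rather than rules mined from a Persian development set, the gate is a simple token-overlap heuristic standing in for the paper's classifier, and `neural` is a hypothetical callable standing in for the transformer-based model. All names are invented for this sketch.

```python
import re
from typing import Callable, Set, Tuple

# Toy English stand-ins (assumptions); the paper mines its rules from Persian data.
NEGATION_MARKERS = {"not", "no", "never"}
ANTONYM_PAIRS = {frozenset(("hot", "cold")), frozenset(("open", "closed"))}


def tokenize(sentence: str) -> Set[str]:
    """Lowercase the sentence and return its set of word and number tokens."""
    return set(re.findall(r"[a-z]+|\d+(?:\.\d+)?", sentence.lower()))


def is_number(token: str) -> bool:
    return token[0].isdigit()


def rules_applicable(a: Set[str], b: Set[str], threshold: float = 0.5) -> bool:
    """Hypothetical gate: Jaccard token overlap, standing in for the classifier."""
    union = a | b
    return bool(union) and len(a & b) / len(union) >= threshold


def negation_rule(a: Set[str], b: Set[str]) -> bool:
    # Exactly one sentence is negated and the remaining tokens are identical.
    has_neg_a = bool(a & NEGATION_MARKERS)
    has_neg_b = bool(b & NEGATION_MARKERS)
    return (has_neg_a != has_neg_b) and (a - NEGATION_MARKERS) == (b - NEGATION_MARKERS)


def numerical_rule(a: Set[str], b: Set[str]) -> bool:
    # Both sentences contain numbers, the numbers differ, and the words match.
    nums_a = {t for t in a if is_number(t)}
    nums_b = {t for t in b if is_number(t)}
    return bool(nums_a) and bool(nums_b) and nums_a != nums_b and (a - nums_a) == (b - nums_b)


def antonym_rule(a: Set[str], b: Set[str]) -> bool:
    # The sentences differ in exactly one token each, and those tokens are antonyms.
    only_a, only_b = a - b, b - a
    return len(only_a) == 1 and len(only_b) == 1 and frozenset(only_a | only_b) in ANTONYM_PAIRS


RULES = (("negation", negation_rule), ("numerical", numerical_rule), ("antonym", antonym_rule))


def detect(s1: str, s2: str, neural: Callable[[str, str], bool]) -> Tuple[bool, str]:
    """Return (is_contradiction, source), where source names the component that decided."""
    a, b = tokenize(s1), tokenize(s2)
    if rules_applicable(a, b):
        for name, rule in RULES:
            if rule(a, b):
                return True, f"rule:{name}"
        return False, "rule:none"
    # Pairs the gate rejects fall through to the (hypothetical) neural model.
    return neural(s1, s2), "neural"


def stub_neural(s1: str, s2: str) -> bool:
    """Placeholder for a fine-tuned transformer classifier; always answers False."""
    return False
```

For example, `detect("The door is open", "The door is not open", stub_neural)` is decided by the negation rule, while a pair with little lexical overlap is passed to `stub_neural`. In a real system the gate and the fallback would both be trained models, and the rule set would come from the mining step rather than being written by hand.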