Natural Language Engineering: latest articles, page 8

Killing me softly: Creative and cognitive aspects of implicitness in abusive language online

IF 2.5, Zone 3 (Computer Science)

Natural Language Engineering Pub Date : 2022-08-03 DOI: 10.1017/s1351324922000316

Simona Frenda, V. Patti, Paolo Rosso

Citations: 3

An empirical study of incorporating syntactic constraints into BERT-based location metonymy resolution

IF 2.5, Zone 3 (Computer Science)

Natural Language Engineering Pub Date : 2022-08-01 DOI: 10.1017/S135132492200033X

Hao Wang, Siyuan Du, X. Zheng, Li Meng

Citations: 0

From unified phrase representation to bilingual phrase alignment in an unsupervised manner

IF 2.5, Zone 3 (Computer Science)

Natural Language Engineering Pub Date : 2022-08-01 DOI: 10.1017/S1351324922000328

Jingshu Liu, E. Morin, Sebastian Peña Saldarriaga, Joseph Lark

Citations: 0

Named-entity recognition in Turkish legal texts

IF 2.5, Zone 3 (Computer Science)

Natural Language Engineering Pub Date : 2022-07-11 DOI: 10.1017/S1351324922000304

Can Çetindağ, Berkay Yazıcıoğlu, Aykut Koç

Abstract: Natural language processing (NLP) technologies and applications in legal text processing are gaining momentum. Being one of the most prominent tasks in NLP, named-entity recognition (NER) can substantiate a great convenience for NLP in law due to the variety of named entities in the legal domain and their accentuated importance in legal documents. However, domain-specific NER models in the legal domain are not well studied. We present a NER model for Turkish legal texts with a custom-made corpus as well as several NER architectures based on conditional random fields and bidirectional long-short-term memories (BiLSTMs) to address the task. We also study several combinations of different word embeddings consisting of GloVe, Morph2Vec, and neural network-based character feature extraction techniques either with BiLSTM or convolutional neural networks. We report 92.27% F1 score with a hybrid word representation of GloVe and Morph2Vec with character-level features extracted with BiLSTM. Being an agglutinative language, the morphological structure of Turkish is also considered. To the best of our knowledge, our work is the first legal domain-specific NER study in Turkish and also the first study for an agglutinative language in the legal domain. Thus, our work can also have implications beyond the Turkish language.

Volume 29, pages 615-642

Citations: 8

Neural automated writing evaluation for Korean L2 writing

IF 2.5, Zone 3 (Computer Science)

Natural Language Engineering Pub Date : 2022-07-07 DOI: 10.1017/S1351324922000298

Kyungtae Lim, Jayoung Song, Jungyeul Park

Abstract: Although Korean language education is experiencing rapid growth in recent years and several studies have investigated automated writing evaluation (AWE) systems, AWE for Korean L2 writing still remains unexplored. Therefore, this study aims to develop and validate a state-of-the-art neural model AWE system which can be widely used for Korean language teaching and learning. Based on a Korean learner corpus, the proposed AWE is developed using natural language processing techniques such as part-of-speech tagging, syntactic parsing, and statistical language modeling to engineer linguistic features and a pre-trained neural language model. This study attempted to determine how neural network models use different linguistic features to improve AWE performance. Experimental results of the proposed AWE tool showed that the neural AWE system achieves high reliability for unseen test data from the corpus, which implies metrics used in the AWE system can help differentiate different proficiency levels and predict holistic scores. Furthermore, the results confirmed that the proposed linguistic features–syntactic complexity, quantitative complexity, and fluency–offer benefits that complement neural automated writing evaluation.

Volume 29, pages 1341-1363

Citations: 4

Construction Grammar Conceptual Network: Coordination-based graph method for semantic association analysis

IF 2.5, Zone 3 (Computer Science)

Natural Language Engineering Pub Date : 2022-07-04 DOI: 10.1017/S1351324922000274

Benedikt Perak, Tajana Ban Kirigin

Abstract: In this article, we present the Construction Grammar Conceptual Network method, developed for identifying lexical similarity and word sense discrimination in a syntactically tagged corpus, based on the cognitive linguistic assumption that coordination construction instantiates conceptual relatedness. This graph analysis method projects a semantic value onto a given coordinated syntactic dependency and constructs a second-order lexical network of lexical collocates with a high co-occurrence measure. The subsequent process of clustering and pruning the graph reveals lexical communities with high conceptual similarity, which are interpreted as associated senses of the source lexeme. We demonstrate the theory and its application to the task of identifying the conceptual structure and different meanings of nouns, adjectives and verbs using examples from different corpora, and explain the modulating effects of linguistic and graph parameters. This graph approach is based on syntactic dependency processing and can be used as a complementary method to other contemporary natural language processing resources to enrich semantic tasks such as word disambiguation, domain relatedness, sense structure, identification of synonymy, metonymy, and metaphoricity, as well as to automate comprehensive meta-reasoning about languages and identify cross/intra-cultural discourse variations of prototypical conceptualization patterns and knowledge representations. As a contribution, we provide a web-based app at http://emocnet.uniri.hr/.

Volume 29, pages 584-614

Citations: 0

Artificial fine-tuning tasks for yes/no question answering

IF 2.5, Zone 3 (Computer Science)

Natural Language Engineering Pub Date : 2022-06-30 DOI: 10.1017/s1351324922000286

Dimitris Dimitriadis, Grigorios Tsoumakas

Abstract: Current research in yes/no question answering (QA) focuses on transfer learning techniques and transformer-based models. Models trained on large corpora are fine-tuned on tasks similar to yes/no QA, and then the captured knowledge is transferred for solving the yes/no QA task. Most previous studies use existing similar tasks, such as natural language inference or extractive QA, for the fine-tuning step. This paper follows a different perspective, hypothesizing that an artificial yes/no task can transfer useful knowledge for improving the performance of yes/no QA. We introduce three such tasks for this purpose, by adapting three corresponding existing tasks: candidate answer validation, sentiment classification, and lexical simplification. Furthermore, we experimented with three different variations of the BERT model (BERT base, RoBERTa, and ALBERT). The results show that our hypothesis holds true for all artificial tasks, despite the small size of the corresponding datasets that are used for the fine-tuning process, the differences between these tasks, the decisions that we made to adapt the original ones, and the tasks’ simplicity. This gives an alternative perspective on how to deal with the yes/no QA problem, that is more creative, and at the same time more flexible, as it can exploit multiple other existing tasks and corresponding datasets to improve yes/no QA models.

Citations: 0

Automated hate speech detection and span extraction in underground hacking and extremist forums

IF 2.5, Zone 3 (Computer Science)

Natural Language Engineering Pub Date : 2022-06-20 DOI: 10.1017/S1351324922000262

Linda Zhou, Andrew Caines, Ildiko Pete, Alice Hutchings

Abstract: Hate speech is any kind of communication that attacks a person or a group based on their characteristics, such as gender, religion and race. Due to the availability of online platforms where people can express their (hateful) opinions, the amount of hate speech is steadily increasing that often leads to offline hate crimes. This paper focuses on understanding and detecting hate speech in underground hacking and extremist forums where cybercriminals and extremists, respectively, communicate with each other, and some of them are associated with criminal activity. Moreover, due to the lengthy posts, it would be beneficial to identify the specific span of text containing hateful content in order to assist site moderators with the removal of hate speech. This paper describes a hate speech dataset composed of posts extracted from HackForums, an online hacking forum, and Stormfront and Incels.co, two extremist forums. We combined our dataset with a Twitter hate speech dataset to train a multi-platform classifier. Our evaluation shows that a classifier trained on multiple sources of data does not always improve the performance compared to a mono-platform classifier. Finally, this is the first work on extracting hate speech spans from longer texts. The paper fine-tunes BERT (Bidirectional Encoder Representations from Transformers) and adopts two approaches – span prediction and sequence labelling. Both approaches successfully extract hateful spans and achieve an F1-score of at least 69%.

Volume 29, pages 1247-1274

Citations: 3

NLE volume 28 issue 4 Cover and Back matter

IF 2.5, Zone 3 (Computer Science)

Natural Language Engineering Pub Date : 2022-06-16 DOI: 10.1017/s1351324922000250

Citations: 0

NLE volume 28 issue 4 Cover and Front matter

IF 2.5, Zone 3 (Computer Science)

Natural Language Engineering Pub Date : 2022-06-16 DOI: 10.1017/s1351324922000249

R. Mitkov, B. Boguraev

Citations: 0