Natural Language Engineering最新文献

Start-up activity in the LLM ecosystem 法律硕士生态系统中的创业活动

IF 2.5 3区计算机科学

Natural Language Engineering Pub Date : 2024-05-14 DOI: 10.1017/s1351324924000032

Robert Dale

引用次数: 0

Automated annotation of parallel bible corpora with cross-lingual semantic concordance 利用跨语言语义对照自动注释平行圣经语料库

IF 2.5 3区计算机科学

Natural Language Engineering Pub Date : 2024-01-25 DOI: 10.1017/s135132492300058x

Jens Dörpinghaus

引用次数: 0

Anisotropic span embeddings and the negative impact of higher-order inference for coreference resolution: An empirical analysis 各向异性跨度嵌入和高阶推理对核心参照解析的负面影响：实证分析

IF 2.5 3区计算机科学

Natural Language Engineering Pub Date : 2024-01-25 DOI: 10.1017/s1351324924000019

Feng Hou, Ruili Wang, See-Kiong Ng, Fangyi Zhu, Michael Witbrock, Steven F. Cahan, Lily Chen, Xiaoyun Jia

{"title":"Anisotropic span embeddings and the negative impact of higher-order inference for coreference resolution: An empirical analysis","authors":"Feng Hou, Ruili Wang, See-Kiong Ng, Fangyi Zhu, Michael Witbrock, Steven F. Cahan, Lily Chen, Xiaoyun Jia","doi":"10.1017/s1351324924000019","DOIUrl":"https://doi.org/10.1017/s1351324924000019","url":null,"abstract":"Coreference resolution is the task of identifying and clustering mentions that refer to the same entity in a document. Based on state-of-the-art deep learning approaches, end-to-end coreference resolution considers all spans as candidate mentions and tackles mention detection and coreference resolution simultaneously. Recently, researchers have attempted to incorporate document-level context using higher-order inference (HOI) to improve end-to-end coreference resolution. However, HOI methods have been shown to have marginal or even negative impact on coreference resolution. In this paper, we reveal the reasons for the negative impact of HOI coreference resolution. Contextualized representations (e.g., those produced by BERT) for building span embeddings have been shown to be highly anisotropic. We show that HOI actually increases and thus worsens the anisotropy of span embeddings and makes it difficult to distinguish between related but distinct entities (e.g., pilots and flight attendants). Instead of using HOI, we propose two methods, Less-Anisotropic Internal Representations (LAIR) and Data Augmentation with Document Synthesis and Mention Swap (DSMS), to learn less-anisotropic span embeddings for coreference resolution. LAIR uses a linear aggregation of the first layer and the topmost layer of contextualized embeddings. DSMS generates more diversified examples of related but distinct entities by synthesizing documents and by mention swapping. Our experiments show that less-anisotropic span embeddings improve the performance significantly (+2.8 F1 gain on the OntoNotes benchmark) reaching new state-of-the-art performance on the GAP dataset.","PeriodicalId":49143,"journal":{"name":"Natural Language Engineering","volume":"10 1","pages":""},"PeriodicalIF":2.5,"publicationDate":"2024-01-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139553422","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 0

How do control tokens affect natural language generation tasks like text simplification 控制标记如何影响文本简化等自然语言生成任务

IF 2.5 3区计算机科学

Natural Language Engineering Pub Date : 2024-01-23 DOI: 10.1017/s1351324923000566

Zihao Li, Matthew Shardlow

引用次数: 0

Emerging trends: When can users trust GPT, and when should they intervene? 新趋势：用户何时可以信任 GPT，何时应该进行干预？

IF 2.5 3区计算机科学

Natural Language Engineering Pub Date : 2024-01-16 DOI: 10.1017/s1351324923000578

Kenneth Church

引用次数: 0

Lightweight transformers for clinical natural language processing 用于临床自然语言处理的轻量级转换器

IF 2.5 3区计算机科学

Natural Language Engineering Pub Date : 2024-01-12 DOI: 10.1017/s1351324923000542

Omid Rohanian, Mohammadmahdi Nouriborji, Hannah Jauncey, Samaneh Kouchaki, Farhad Nooralahzadeh, ISARIC Clinical Characterisation Group, Lei Clifton, Laura Merson, David A. Clifton

{"title":"Lightweight transformers for clinical natural language processing","authors":"Omid Rohanian, Mohammadmahdi Nouriborji, Hannah Jauncey, Samaneh Kouchaki, Farhad Nooralahzadeh, ISARIC Clinical Characterisation Group, Lei Clifton, Laura Merson, David A. Clifton","doi":"10.1017/s1351324923000542","DOIUrl":"https://doi.org/10.1017/s1351324923000542","url":null,"abstract":"Specialised pre-trained language models are becoming more frequent in Natural language Processing (NLP) since they can potentially outperform models trained on generic texts. BioBERT (Sanh et al., Distilbert, a distilled version of bert: smaller, faster, cheaper and lighter. arXiv preprint arXiv: 1910.01108, 2019) and BioClinicalBERT (Alsentzer et al., Publicly available clinical bert embeddings. In Proceedings of the 2nd Clinical Natural Language Processing Workshop, pp. 72–78, 2019) are two examples of such models that have shown promise in medical NLP tasks. Many of these models are overparametrised and resource-intensive, but thanks to techniques like knowledge distillation, it is possible to create smaller versions that perform almost as well as their larger counterparts. In this work, we specifically focus on development of compact language models for processing clinical texts (i.e. progress notes, discharge summaries, etc). We developed a number of efficient lightweight clinical transformers using knowledge distillation and continual learning, with the number of parameters ranging from <img data-mimesubtype=\"png\" data-type=\"\" src=\"https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20240111120239472-0609:S1351324923000542:S1351324923000542_inline1.png\">$15$</img> million to <img data-mimesubtype=\"png\" data-type=\"\" src=\"https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20240111120239472-0609:S1351324923000542:S1351324923000542_inline2.png\">$65$</img> million. These models performed comparably to larger models such as BioBERT and ClinicalBioBERT and significantly outperformed other compact models trained on general or biomedical data. Our extensive evaluation was done across several standard datasets and covered a wide range of clinical text-mining tasks, including natural language inference, relation extraction, named entity recognition and sequence classification. To our knowledge, this is the first comprehensive study specifically focused on creating efficient and compact transformers for clinical NLP tasks. The models and code used in this study can be found on our Huggingface profile at https://huggingface.co/nlpie and Github page at https://github.com/nlpie-research/Lightweight-Clinical-Transformers, respectively, promoting reproducibility of our results.","PeriodicalId":49143,"journal":{"name":"Natural Language Engineering","volume":"165 1","pages":""},"PeriodicalIF":2.5,"publicationDate":"2024-01-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139462167","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 0

Actionable conversational quality indicators for improving task-oriented dialog systems 用于改进任务导向型对话系统的可操作对话质量指标

IF 2.5 3区计算机科学

Natural Language Engineering Pub Date : 2024-01-09 DOI: 10.1017/s1351324923000372

Michael Higgins, Dominic Widdows, Beth Ann Hockey, Akshay Hazare, Kristen Howell, Gwen Christian, Sujit Mathi, Chris Brew, Andrew Maurer, George Bonev, Matthew Dunn, Joseph Bradley

{"title":"Actionable conversational quality indicators for improving task-oriented dialog systems","authors":"Michael Higgins, Dominic Widdows, Beth Ann Hockey, Akshay Hazare, Kristen Howell, Gwen Christian, Sujit Mathi, Chris Brew, Andrew Maurer, George Bonev, Matthew Dunn, Joseph Bradley","doi":"10.1017/s1351324923000372","DOIUrl":"https://doi.org/10.1017/s1351324923000372","url":null,"abstract":"Automatic dialog systems have become a mainstream part of online customer service. Many such systems are built, maintained, and improved by customer service specialists, rather than dialog systems engineers and computer programmers. As conversations between people and machines become commonplace, it is critical to understand what is working, what is not, and what actions can be taken to reduce the frequency of inappropriate system responses. These analyses and recommendations need to be presented in terms that directly reflect the user experience rather than the internal dialog processing. This paper introduces and explains the use of Actionable Conversational Quality Indicators (ACQIs), which are used both to recognize parts of dialogs that can be improved and to recommend how to improve them. This combines benefits of previous approaches, some of which have focused on producing dialog quality scoring while others have sought to categorize the types of errors the dialog system is making. We demonstrate the effectiveness of using ACQIs on LivePerson internal dialog systems used in commercial customer service applications and on the publicly available LEGOv2 conversational dataset. We report on the annotation and analysis of conversational datasets showing which ACQIs are important to fix in various situations. The annotated datasets are then used to build a predictive model which uses a turn-based vector embedding of the message texts and achieves a 79% weighted average f1-measure at the task of finding the correct ACQI for a given conversation. We predict that if such a model worked perfectly, the range of potential improvement actions a bot-builder must consider at each turn could be reduced by an average of 81%.","PeriodicalId":49143,"journal":{"name":"Natural Language Engineering","volume":"151 1","pages":""},"PeriodicalIF":2.5,"publicationDate":"2024-01-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139410430","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 0

A year’s a long time in generative AI 生成式人工智能的漫长岁月

IF 2.5 3区计算机科学

Natural Language Engineering Pub Date : 2024-01-08 DOI: 10.1017/s1351324923000554

Robert Dale

引用次数: 0

OffensEval 2023: Offensive language identification in the age of Large Language Models 大型语言模型时代的攻击性语言识别

IF 2.5 3区计算机科学

Natural Language Engineering Pub Date : 2023-12-06 DOI: 10.1017/s1351324923000517

Marcos Zampieri, Sara Rosenthal, Preslav Nakov, Alphaeus Dmonte, Tharindu Ranasinghe

{"title":"OffensEval 2023: Offensive language identification in the age of Large Language Models","authors":"Marcos Zampieri, Sara Rosenthal, Preslav Nakov, Alphaeus Dmonte, Tharindu Ranasinghe","doi":"10.1017/s1351324923000517","DOIUrl":"https://doi.org/10.1017/s1351324923000517","url":null,"abstract":"The OffensEval shared tasks organized as part of SemEval-2019–2020 were very popular, attracting over 1300 participating teams. The two editions of the shared task helped advance the state of the art in offensive language identification by providing the community with benchmark datasets in Arabic, Danish, English, Greek, and Turkish. The datasets were annotated using the OLID hierarchical taxonomy, which since then has become the de facto standard in general offensive language identification research and was widely used beyond OffensEval. We present a survey of OffensEval and related competitions, and we discuss the main lessons learned. We further evaluate the performance of Large Language Models (LLMs), which have recently revolutionalized the field of Natural Language Processing. We use zero-shot prompting with six popular LLMs and zero-shot learning with two task-specific fine-tuned BERT models, and we compare the results against those of the top-performing teams at the OffensEval competitions. Our results show that while some LMMs such as Flan-T5 achieve competitive performance, in general LLMs lag behind the best OffensEval systems.","PeriodicalId":49143,"journal":{"name":"Natural Language Engineering","volume":"187 ","pages":""},"PeriodicalIF":2.5,"publicationDate":"2023-12-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"138506470","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 0

Preface: Special issue on NLP approaches to offensive content online 序言:关于网络攻击性内容的NLP方法的特刊

IF 2.5 3区计算机科学

Natural Language Engineering Pub Date : 2023-12-06 DOI: 10.1017/s1351324923000499

Marcos Zampieri, Isabelle Augenstein, Siddharth Krishnan, Joshua Melton, Preslav Nakov

引用次数: 0