AACL Bioflux: Latest Articles

Dead or Murdered? Predicting Responsibility Perception in Femicide News Reports
AACL Bioflux Pub Date: 2022-09-24 DOI: 10.48550/arXiv.2209.12030
Gosse Minnema, Sara Gemelli, C. Zanchi, T. Caselli, M. Nissim
{"title":"Dead or Murdered? Predicting Responsibility Perception in Femicide News Reports","authors":"Gosse Minnema, Sara Gemelli, C. Zanchi, T. Caselli, M. Nissim","doi":"10.48550/arXiv.2209.12030","DOIUrl":"https://doi.org/10.48550/arXiv.2209.12030","url":null,"abstract":"Different linguistic expressions can conceptualize the same event from different viewpoints by emphasizing certain participants over others. Here, we investigate a case where this has social consequences: how do linguistic expressions of gender-based violence (GBV) influence who we perceive as responsible? We build on previous psycholinguistic research in this area and conduct a large-scale perception survey of GBV descriptions automatically extracted from a corpus of Italian newspapers. We then train regression models that predict the salience of GBV participants with respect to different dimensions of perceived responsibility. Our best model (fine-tuned BERT) shows solid overall performance, with large differences between dimensions and participants: salient _focus_ is more predictable than salient _blame_, and perpetrators’ salience is more predictable than victims’ salience. Experiments with ridge regression models using different representations show that features based on linguistic theory similarly to word-based features. Overall, we show that different linguistic choices do trigger different perceptions of responsibility, and that such perceptions can be modelled automatically. 
This work can be a core instrument to raise awareness of the consequences of different perspectivizations in the general public and in news producers alike.","PeriodicalId":39298,"journal":{"name":"AACL Bioflux","volume":"230 1","pages":"1078-1090"},"PeriodicalIF":0.0,"publicationDate":"2022-09-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"86803013","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 2
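The ridge-regression baseline mentioned in the abstract can be sketched with the closed-form solution. This is a minimal illustration, not the authors' code: the two features below (e.g. whether the perpetrator is the grammatical subject, whether the verb is passive) are hypothetical stand-ins for the paper's linguistic features.

```python
import numpy as np

def ridge_fit(X, y, lam=1.0):
    """Closed-form ridge regression: w = (X^T X + lam*I)^-1 X^T y."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

# Toy data: rows are GBV descriptions, columns are hypothetical binary
# linguistic features; the target is a perceived-responsibility score.
X = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0],
              [0.0, 0.0]])
y = np.array([0.9, 0.2, 0.7, 0.1])

w = ridge_fit(X, y, lam=0.1)
pred = X @ w
print(w, pred)
```

With these toy targets, the first feature receives the larger weight, mirroring how a regression over interpretable features exposes which linguistic choices drive the predicted salience.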
Whodunit? Learning to Contrast for Authorship Attribution
AACL Bioflux Pub Date: 2022-09-23 DOI: 10.48550/arXiv.2209.11887
Bo Ai, Yuchen Wang, Yugin Tan, Samson Tan
{"title":"Whodunit? Learning to Contrast for Authorship Attribution","authors":"Bo Ai, Yuchen Wang, Yugin Tan, Samson Tan","doi":"10.48550/arXiv.2209.11887","DOIUrl":"https://doi.org/10.48550/arXiv.2209.11887","url":null,"abstract":"Authorship attribution is the task of identifying the author of a given text. The key is finding representations that can differentiate between authors. Existing approaches typically use manually designed features that capture a dataset’s content and style, but these approaches are dataset-dependent and yield inconsistent performance across corpora. In this work, we propose to learn author-specific representations by fine-tuning pre-trained generic language representations with a contrastive objective (Contra-X). We show that Contra-X learns representations that form highly separable clusters for different authors. It advances the state-of-the-art on multiple human and machine authorship attribution benchmarks, enabling improvements of up to 6.8% over cross-entropy fine-tuning. However, we find that Contra-X improves overall accuracy at the cost of sacrificing performance for some authors. Resolving this tension will be an important direction for future work. To the best of our knowledge, we are the first to integrate contrastive learning with pre-trained language model fine-tuning for authorship attribution.","PeriodicalId":39298,"journal":{"name":"AACL Bioflux","volume":"34 1","pages":"1142-1157"},"PeriodicalIF":0.0,"publicationDate":"2022-09-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"84023264","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 5
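Contra-X's exact objective is not reproduced here, but the general supervised contrastive loss it builds on (pull same-author embeddings together, push different-author embeddings apart) can be sketched in NumPy. The embeddings and labels below are toy values.

```python
import numpy as np

def supervised_contrastive_loss(z, labels, tau=0.1):
    """For each anchor, treat same-label embeddings as positives and
    all other embeddings as the contrast set (Khosla et al.-style)."""
    z = z / np.linalg.norm(z, axis=1, keepdims=True)  # unit-normalize
    sim = z @ z.T / tau                               # scaled cosine similarity
    n = len(labels)
    loss, count = 0.0, 0
    for i in range(n):
        positives = [j for j in range(n) if j != i and labels[j] == labels[i]]
        if not positives:
            continue
        others = [j for j in range(n) if j != i]
        denom = np.sum(np.exp(sim[i, others]))
        for j in positives:
            loss += -np.log(np.exp(sim[i, j]) / denom)
            count += 1
    return loss / count

# Well-separated author clusters should give a lower loss than mixed ones.
sep = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]])
mix = np.array([[1.0, 0.0], [0.0, 1.0], [0.9, 0.1], [0.1, 0.9]])
labels = [0, 0, 1, 1]
print(supervised_contrastive_loss(sep, labels),
      supervised_contrastive_loss(mix, labels))
```

The loss is lower exactly when same-author embeddings cluster, which is the separability property the abstract reports.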
Learning Interpretable Latent Dialogue Actions With Less Supervision
AACL Bioflux Pub Date: 2022-09-22 DOI: 10.48550/arXiv.2209.11128
Vojtěch Hudeček, Ondřej Dušek
{"title":"Learning Interpretable Latent Dialogue Actions With Less Supervision","authors":"Vojtvech Hudevcek, Ondrej Dusek","doi":"10.48550/arXiv.2209.11128","DOIUrl":"https://doi.org/10.48550/arXiv.2209.11128","url":null,"abstract":"We present a novel architecture for explainable modeling of task-oriented dialogues with discrete latent variables to represent dialogue actions. Our model is based on variational recurrent neural networks (VRNN) and requires no explicit annotation of semantic information. Unlike previous works, our approach models the system and user turns separately and performs database query modeling, which makes the model applicable to task-oriented dialogues while producing easily interpretable action latent variables. We show that our model outperforms previous approaches with less supervision in terms of perplexity and BLEU on three datasets, and we propose a way to measure dialogue success without the need for expert annotation. Finally, we propose a novel way to explain semantics of the latent variables with respect to system actions.","PeriodicalId":39298,"journal":{"name":"AACL Bioflux","volume":"16 1","pages":"297-308"},"PeriodicalIF":0.0,"publicationDate":"2022-09-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"85813092","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 1
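The paper learns discrete latent dialogue actions inside a VRNN; a common trick for sampling such discrete variables while keeping training differentiable is the Gumbel-softmax relaxation. The sketch below illustrates the general technique, not necessarily the estimator used by the authors.

```python
import numpy as np

rng = np.random.default_rng(0)

def gumbel_softmax(logits, tau=1.0):
    """Draw a relaxed (soft) one-hot sample over discrete latent actions.
    Lower tau makes the sample closer to a hard one-hot vector."""
    g = -np.log(-np.log(rng.uniform(size=logits.shape)))  # Gumbel(0,1) noise
    y = (logits + g) / tau
    e = np.exp(y - y.max())
    return e / e.sum()

logits = np.array([2.0, 0.5, -1.0])  # scores over 3 latent dialogue actions
sample = gumbel_softmax(logits, tau=0.5)
print(sample)
```

Each sample is a valid probability vector over the action inventory, so the latent choice stays interpretable while gradients can flow through it during training.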
Seeking Diverse Reasoning Logic: Controlled Equation Expression Generation for Solving Math Word Problems
AACL Bioflux Pub Date: 2022-09-21 DOI: 10.48550/arXiv.2209.10310
Yibin Shen, Qianying Liu, Zhuoyuan Mao, Zhen Wan, Fei Cheng, S. Kurohashi
{"title":"Seeking Diverse Reasoning Logic: Controlled Equation Expression Generation for Solving Math Word Problems","authors":"Yibin Shen, Qianying Liu, Zhuoyuan Mao, Zhen Wan, Fei Cheng, S. Kurohashi","doi":"10.48550/arXiv.2209.10310","DOIUrl":"https://doi.org/10.48550/arXiv.2209.10310","url":null,"abstract":"To solve Math Word Problems, human students leverage diverse reasoning logic that reaches different possible equation solutions. However, the mainstream sequence-to-sequence approach of automatic solvers aims to decode a fixed solution equation supervised by human annotation. In this paper, we propose a controlled equation generation solver by leveraging a set of control codes to guide the model to consider certain reasoning logic and decode the corresponding equations expressions transformed from the human reference. The empirical results suggest that our method universally improves the performance on single-unknown (Math23K) and multiple-unknown (DRAW1K, HMWP) benchmarks, with substantial improvements up to 13.2% accuracy on the challenging multiple-unknown datasets.","PeriodicalId":39298,"journal":{"name":"AACL Bioflux","volume":"68 1","pages":"254-260"},"PeriodicalIF":0.0,"publicationDate":"2022-09-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"79113952","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 4
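The core mechanism, prepending a control code so the decoder targets one specific equation form, can be sketched in a few lines. The code names below (`<forward>`, `<reversed>`) are hypothetical; the paper defines its own inventory of codes.

```python
# Minimal sketch: prefix a source problem with a control code so a
# seq2seq solver can be conditioned on the desired reasoning logic.
def add_control_code(problem: str, code: str) -> str:
    return f"<{code}> {problem}"

problem = "Tom has 3 apples and buys 4 more. How many apples does he have?"
for code in ("forward", "reversed"):
    print(add_control_code(problem, code))
```

At training time each reference equation is paired with the code describing its form, so at inference time varying the prefix steers the model toward different, equally valid equation expressions.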
Dodging the Data Bottleneck: Automatic Subtitling with Automatically Segmented ST Corpora
AACL Bioflux Pub Date: 2022-09-21 DOI: 10.48550/arXiv.2209.10608
Sara Papi, Alina Karakanta, Matteo Negri, M. Turchi
{"title":"Dodging the Data Bottleneck: Automatic Subtitling with Automatically Segmented ST Corpora","authors":"Sara Papi, Alina Karakanta, Matteo Negri, M. Turchi","doi":"10.48550/arXiv.2209.10608","DOIUrl":"https://doi.org/10.48550/arXiv.2209.10608","url":null,"abstract":"Speech translation for subtitling (SubST) is the task of automatically translating speech data into well-formed subtitles by inserting subtitle breaks compliant to specific displaying guidelines. Similar to speech translation (ST), model training requires parallel data comprising audio inputs paired with their textual translations. In SubST, however, the text has to be also annotated with subtitle breaks. So far, this requirement has represented a bottleneck for system development, as confirmed by the dearth of publicly available SubST corpora. To fill this gap, we propose a method to convert existing ST corpora into SubST resources without human intervention. We build a segmenter model that automatically segments texts into proper subtitles by exploiting audio and text in a multimodal fashion, achieving high segmentation quality in zero-shot conditions. Comparative experiments with SubST systems respectively trained on manual and automatic segmentations result in similar performance, showing the effectiveness of our approach.","PeriodicalId":39298,"journal":{"name":"AACL Bioflux","volume":"108 1","pages":"480-487"},"PeriodicalIF":0.0,"publicationDate":"2022-09-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"86258928","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 4
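For reference, here is what the segmentation target looks like: subtitle breaks are inserted as `<eol>` (line break) and `<eob>` (end of subtitle block) tokens. The greedy text-only baseline below uses a 42-characters-per-line limit, a common subtitling guideline; the paper's segmenter is a learned multimodal model, so this sketch only illustrates the output format.

```python
def naive_subtitle_breaks(words, max_chars=42, lines_per_block=2):
    """Greedy baseline: emit <eol> when the line limit would be exceeded,
    and <eob> after every `lines_per_block` lines."""
    out, cur_len, lines = [], 0, 0
    for w in words:
        needed = len(w) if cur_len == 0 else cur_len + 1 + len(w)
        if cur_len > 0 and needed > max_chars:
            lines += 1
            out.append("<eob>" if lines % lines_per_block == 0 else "<eol>")
            cur_len = len(w)
        else:
            cur_len = needed
        out.append(w)
    return " ".join(out)

text = ("speech translation for subtitling is the task of automatically "
        "translating speech into well formed subtitles").split()
segmented = naive_subtitle_breaks(text)
print(segmented)
```

Every segment between breaks respects the display limit; the learned model additionally uses the audio to place breaks at natural pauses.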
F-coref: Fast, Accurate and Easy to Use Coreference Resolution
AACL Bioflux Pub Date: 2022-09-09 DOI: 10.48550/arXiv.2209.04280
Shon Otmazgin, Arie Cattan, Yoav Goldberg
{"title":"F-coref: Fast, Accurate and Easy to Use Coreference Resolution","authors":"Shon Otmazgin, Arie Cattan, Yoav Goldberg","doi":"10.48550/arXiv.2209.04280","DOIUrl":"https://doi.org/10.48550/arXiv.2209.04280","url":null,"abstract":"We introduce fastcoref, a python package for fast, accurate, and easy-to-use English coreference resolution. The package is pip-installable, and allows two modes: an accurate mode based on the LingMess architecture, providing state-of-the-art coreference accuracy, and a substantially faster model, F-coref, which is the focus of this work. F-coref allows to process 2.8K OntoNotes documents in 25 seconds on a V100 GPU (compared to 6 minutes for the LingMess model, and to 12 minutes of the popular AllenNLP coreference model) with only a modest drop in accuracy. The fast speed is achieved through a combination of distillation of a compact model from the LingMess model, and an efficient batching implementation using a technique we call leftover batching. https://github.com/shon-otmazgin/fastcoref","PeriodicalId":39298,"journal":{"name":"AACL Bioflux","volume":"47 1","pages":"48-56"},"PeriodicalIF":0.0,"publicationDate":"2022-09-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"74998841","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 7
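"Leftover batching" is only named, not specified, in the abstract. One plausible reading, pooling the stragglers from each length bucket into a single shared batch instead of running many small, padding-inefficient batches, can be sketched as follows. This is an interpretation of the idea, not the fastcoref implementation.

```python
def leftover_batches(doc_lengths, bucket_size=512, batch_tokens=4096):
    """Group documents into length buckets to limit padding waste; documents
    that do not fill a complete batch within their bucket are pooled into
    one shared 'leftover' batch at the end."""
    buckets = {}
    for i, n in enumerate(doc_lengths):
        buckets.setdefault(n // bucket_size, []).append(i)
    batches, leftovers = [], []
    for key in sorted(buckets):
        docs = buckets[key]
        max_len = (key + 1) * bucket_size          # padded length in this bucket
        per_batch = max(1, batch_tokens // max_len)
        while len(docs) >= per_batch:
            batches.append(docs[:per_batch])
            docs = docs[per_batch:]
        leftovers.extend(docs)                     # stragglers go to one batch
    if leftovers:
        batches.append(leftovers)
    return batches

lengths = [100] * 9 + [700, 800, 2000]
batches = leftover_batches(lengths)
print(batches)
```

Here the nine short documents yield one full batch of eight plus one straggler, and all stragglers across buckets run together as a single final batch.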
UKP-SQuARE v2: Explainability and Adversarial Attacks for Trustworthy QA
AACL Bioflux Pub Date: 2022-08-19 DOI: 10.48550/arXiv.2208.09316
Rachneet Sachdeva, Haritz Puerto, Tim Baumgärtner, Sewin Tariverdian, Hao Zhang, Kexin Wang, H. Saad, Leonardo F. R. Ribeiro, Iryna Gurevych
{"title":"UKP-SQuARE v2: Explainability and Adversarial Attacks for Trustworthy QA","authors":"Rachneet Sachdeva, Haritz Puerto, Tim Baumgärtner, Sewin Tariverdian, Hao Zhang, Kexin Wang, H. Saad, Leonardo F. R. Ribeiro, Iryna Gurevych","doi":"10.48550/arXiv.2208.09316","DOIUrl":"https://doi.org/10.48550/arXiv.2208.09316","url":null,"abstract":"Question Answering (QA) systems are increasingly deployed in applications where they support real-world decisions. However, state-of-the-art models rely on deep neural networks, which are difficult to interpret by humans. Inherently interpretable models or post hoc explainability methods can help users to comprehend how a model arrives at its prediction and, if successful, increase their trust in the system. Furthermore, researchers can leverage these insights to develop new methods that are more accurate and less biased. In this paper, we introduce SQuARE v2, the new version of SQuARE, to provide an explainability infrastructure for comparing models based on methods such as saliency maps and graph-based explanations. While saliency maps are useful to inspect the importance of each input token for the model’s prediction, graph-based explanations from external Knowledge Graphs enable the users to verify the reasoning behind the model prediction. In addition, we provide multiple adversarial attacks to compare the robustness of QA models. With these explainability methods and adversarial attacks, we aim to ease the research on trustworthy QA models. 
SQuARE is available on https://square.ukp-lab.de.","PeriodicalId":39298,"journal":{"name":"AACL Bioflux","volume":"2 1","pages":"28-38"},"PeriodicalIF":0.0,"publicationDate":"2022-08-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"88122488","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 2
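SQuARE's saliency maps come from dedicated explainability methods; a minimal, model-agnostic illustration of the same idea is occlusion, scoring each token by how much the model's output drops when that token is removed. The scorer below is a toy keyword counter, not the SQuARE API.

```python
def occlusion_saliency(tokens, score_fn):
    """Saliency of each token = score drop when that token is removed."""
    base = score_fn(tokens)
    return [base - score_fn(tokens[:i] + tokens[i + 1:])
            for i in range(len(tokens))]

# Toy scorer: fraction of tokens overlapping a hypothetical question's keywords.
keywords = {"capital", "france"}
def toy_score(tokens):
    return sum(t.lower() in keywords for t in tokens) / max(len(tokens), 1)

tokens = "Paris is the capital of France".split()
sal = occlusion_saliency(tokens, toy_score)
print(dict(zip(tokens, sal)))
```

Tokens whose removal hurts the score most ("capital", "France") get the highest saliency, which is exactly the signal a saliency map visualizes over a real QA model.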
TexPrax: A Messaging Application for Ethical, Real-time Data Collection and Annotation
AACL Bioflux Pub Date: 2022-08-16 DOI: 10.48550/arXiv.2208.07846
Lorenz Stangier, Ji-Ung Lee, Yuxi Wang, Marvin Müller, Nicholas R. J. Frick, J. Metternich, Iryna Gurevych
{"title":"TexPrax: A Messaging Application for Ethical, Real-time Data Collection and Annotation","authors":"Lorenz Stangier, Ji-Ung Lee, Yuxi Wang, Marvin Müller, Nicholas R. J. Frick, J. Metternich, Iryna Gurevych","doi":"10.48550/arXiv.2208.07846","DOIUrl":"https://doi.org/10.48550/arXiv.2208.07846","url":null,"abstract":"Collecting and annotating task-oriented dialog data is difficult, especially for highly specific domains that require expert knowledge. At the same time, informal communication channels such as instant messengers are increasingly being used at work. This has led to a lot of work-relevant information that is disseminated through those channels and needs to be post-processed manually by the employees. To alleviate this problem, we present TexPrax, a messaging system to collect and annotate _problems_, _causes_, and _solutions_ that occur in work-related chats. TexPrax uses a chatbot to directly engage the employees to provide lightweight annotations on their conversation and ease their documentation work. To comply with data privacy and security regulations, we use an end-to-end message encryption and give our users full control over their data which has various advantages over conventional annotation tools. We evaluate TexPrax in a user-study with German factory employees who ask their colleagues for solutions on problems that arise during their daily work. Overall, we collect 202 task-oriented German dialogues containing 1,027 sentences with sentence-level expert annotations. 
Our data analysis also reveals that real-world conversations frequently contain instances with code-switching, varying abbreviations for the same entity, and dialects which NLP systems should be able to handle.","PeriodicalId":39298,"journal":{"name":"AACL Bioflux","volume":"6 1","pages":"9-16"},"PeriodicalIF":0.0,"publicationDate":"2022-08-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"80374544","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 2
PInKS: Preconditioned Commonsense Inference with Minimal Supervision
AACL Bioflux Pub Date: 2022-06-16 DOI: 10.48550/arXiv.2206.07920
Ehsan Qasemi, Piyush Khanna, Qiang Ning, Muhao Chen
{"title":"PInKS: Preconditioned Commonsense Inference with Minimal Supervision","authors":"Ehsan Qasemi, Piyush Khanna, Qiang Ning, Muhao Chen","doi":"10.48550/arXiv.2206.07920","DOIUrl":"https://doi.org/10.48550/arXiv.2206.07920","url":null,"abstract":"Reasoning with preconditions such as “glass can be used for drinking water unless the glass is shattered” remains an open problem for language models. The main challenge lies in the scarcity of preconditions data and the model’s lack of support for such reasoning. We present PInKS , Preconditioned Commonsense Inference with WeaK Supervision, an improved model for reasoning with preconditions through minimum supervision. We show, empirically and theoretically, that PInKS improves the results on benchmarks focused on reasoning with the preconditions of commonsense knowledge (up to 40% Macro-F1 scores). We further investigate PInKS through PAC-Bayesian informativeness analysis, precision measures, and ablation study.","PeriodicalId":39298,"journal":{"name":"AACL Bioflux","volume":"6 1","pages":"320-336"},"PeriodicalIF":0.0,"publicationDate":"2022-06-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"87035369","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 6
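The weak supervision behind PInKS is described only at a high level here; pattern-based labeling over precondition connectives is one standard way to bootstrap such data cheaply. The rule list below is a hypothetical illustration of that style of weak labeling, not the paper's rule set.

```python
import re

# Hypothetical weak-labeling rules: connectives that typically signal
# whether a precondition disables or enables a commonsense statement.
RULES = [
    (re.compile(r"\bunless\b", re.I), "disabling"),
    (re.compile(r"\bonly if\b", re.I), "enabling"),
    (re.compile(r"\bas long as\b", re.I), "enabling"),
]

def weak_label(sentence):
    for pattern, label in RULES:
        if pattern.search(sentence):
            return label
    return None  # abstain when no rule fires

examples = [
    "A glass can be used for drinking water unless the glass is shattered.",
    "You can board the plane only if you have a ticket.",
    "The sky is blue.",
]
weak_labels = [weak_label(s) for s in examples]
print(weak_labels)
```

Abstaining on sentences without a trigger keeps the weakly labeled corpus precise at the cost of coverage, the usual trade-off in minimally supervised data collection.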
Director: Generator-Classifiers For Supervised Language Modeling
AACL Bioflux Pub Date: 2022-06-15 DOI: 10.48550/arXiv.2206.07694
Kushal Arora, Kurt Shuster, Sainbayar Sukhbaatar, J. Weston
{"title":"Director: Generator-Classifiers For Supervised Language Modeling","authors":"Kushal Arora, Kurt Shuster, Sainbayar Sukhbaatar, J. Weston","doi":"10.48550/arXiv.2206.07694","DOIUrl":"https://doi.org/10.48550/arXiv.2206.07694","url":null,"abstract":"Current language models achieve low perplexity but their resulting generations still suffer from toxic responses, repetitiveness, and contradictions. The standard language modeling setup fails to address these issues. In this paper, we introduce a new architecture, Director, that consists of a unified generator-classifier with both a language modeling and a classification head for each output token. Training is conducted jointly using both standard language modeling data, and data labeled with desirable and undesirable sequences. Experiments in several settings show that the model has competitive training and decoding speed compared to standard language models while yielding superior results, avoiding undesirable behaviors while maintaining generation quality. It also outperforms existing model guiding approaches in terms of both accuracy and efficiency. Our code is made publicly available.","PeriodicalId":39298,"journal":{"name":"AACL Bioflux","volume":"26 1","pages":"512-526"},"PeriodicalIF":0.0,"publicationDate":"2022-06-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"83068655","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 28
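The generator-classifier idea can be illustrated with one plausible way of combining the two heads at decoding time: reweight each candidate token's LM probability by the classifier head's probability that the token is desirable, then renormalize. This is a sketch of the idea; the paper's exact mixing may differ.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def director_next_token(lm_logits, clf_prob_desirable, gamma=1.0):
    """Combine LM head and classifier head per candidate token:
    p(t) ∝ p_LM(t) * p_clf(desirable | t)^gamma, renormalized."""
    p = softmax(lm_logits) * clf_prob_desirable ** gamma
    return p / p.sum()

vocab = ["great", "fine", "toxic_insult"]
lm_logits = np.array([1.0, 0.5, 1.2])   # the LM alone slightly prefers the bad token
clf = np.array([0.9, 0.8, 0.05])        # the classifier head flags it as undesirable
p = director_next_token(lm_logits, clf)
print(dict(zip(vocab, np.round(p, 3))))
```

Even though the raw LM assigns the undesirable token the highest logit, the classifier head suppresses it, which is how the unified model avoids toxic or contradictory continuations without a separate reranker.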