Transactions of the Association for Computational Linguistics: Latest Articles, Page 2

PaniniQA: Enhancing Patient Education Through Interactive Question Answering

IF 10.9, Zone 1 (Computer Science)

Transactions of the Association for Computational Linguistics Pub Date : 2023-08-07 DOI: 10.1162/tacl_a_00616

Pengshan Cai, Zonghai Yao, Fei Liu, Dakuo Wang, Meghan Reilly, Huixue Zhou, Lingxi Li, Yifan Cao, Alok Kapoor, Adarsha S. Bajracharya, D. Berlowitz, Hongfeng Yu

Citations: 0

Learning to Paraphrase Sentences to Different Complexity Levels

IF 10.9, Zone 1 (Computer Science)

Transactions of the Association for Computational Linguistics Pub Date : 2023-08-04 DOI: 10.1162/tacl_a_00606

Alison Chi, Li-Kuang Chen, Yi-Chen Chang, Shu-Hui Lee, Jason J. S. Chang

Citations: 0

Collective Human Opinions in Semantic Textual Similarity

IF 10.9, Zone 1 (Computer Science)

Transactions of the Association for Computational Linguistics Pub Date : 2023-08-01 DOI: 10.1162/tacl_a_00584

Yuxia Wang, Shimin Tao, Ning Xie, Hao Yang, Timothy Baldwin, K. Verspoor

Citations: 2

Time-and-Space-Efficient Weighted Deduction

IF 10.9, Zone 1 (Computer Science)

Transactions of the Association for Computational Linguistics Pub Date : 2023-08-01 DOI: 10.1162/tacl_a_00588

Jason Eisner

Citations: 1

Multi3WOZ: A Multilingual, Multi-Domain, Multi-Parallel Dataset for Training and Evaluating Culturally Adapted Task-Oriented Dialog Systems

IF 10.9, Zone 1 (Computer Science)

Transactions of the Association for Computational Linguistics Pub Date : 2023-07-26 DOI: 10.1162/tacl_a_00609

Songbo Hu, Han Zhou, Mete Hergul, Milan Gritta, Guchun Zhang, Ignacio Iacobacci, Ivan Vulic, A. Korhonen

Abstract: Creating high-quality annotated data for task-oriented dialog (ToD) is known to be notoriously difficult, and the challenges are amplified when the goal is to create equitable, culturally adapted, and large-scale ToD datasets for multiple languages. Therefore, the current datasets are still very scarce and suffer from limitations such as translation-based non-native dialogs with translation artefacts, small scale, or lack of cultural adaptation, among others. In this work, we first take stock of the current landscape of multilingual ToD datasets, offering a systematic overview of their properties and limitations. Aiming to reduce all the detected limitations, we then introduce Multi3WOZ, a novel multilingual, multi-domain, multi-parallel ToD dataset. It is large-scale and offers culturally adapted dialogs in 4 languages to enable training and evaluation of multilingual and cross-lingual ToD systems. We describe a complex bottom-up data collection process that yielded the final dataset, and offer the first sets of baseline scores across different ToD-related tasks for future reference, also highlighting its challenging nature.

Pages: 1396-1415

Citations: 0

Optimal Transport Posterior Alignment for Cross-lingual Semantic Parsing

IF 10.9, Zone 1 (Computer Science)

Transactions of the Association for Computational Linguistics Pub Date : 2023-07-09 DOI: 10.1162/tacl_a_00611

Tom Sherborne, Tom Hosking, Mirella Lapata

Citations: 0

Testing the Predictions of Surprisal Theory in 11 Languages

IF 10.9, Zone 1 (Computer Science)

Transactions of the Association for Computational Linguistics Pub Date : 2023-07-07 DOI: 10.1162/tacl_a_00612

Ethan Gotlieb Wilcox, Tiago Pimentel, Clara Meister, Ryan Cotterell, R. Levy

Abstract: Surprisal theory posits that less-predictable words should take more time to process, with word predictability quantified as surprisal, i.e., negative log probability in context. While evidence supporting the predictions of surprisal theory has been replicated widely, much of it has focused on a very narrow slice of data: native English speakers reading English texts. Indeed, no comprehensive multilingual analysis exists. We address this gap in the current literature by investigating the relationship between surprisal and reading times in eleven different languages, distributed across five language families. Deriving estimates from language models trained on monolingual and multilingual corpora, we test three predictions associated with surprisal theory: (i) whether surprisal is predictive of reading times, (ii) whether expected surprisal, i.e., contextual entropy, is predictive of reading times, and (iii) whether the linking function between surprisal and reading times is linear. We find that all three predictions are borne out crosslinguistically. By focusing on a more diverse set of languages, we argue that these results offer the most robust link to date between information theory and incremental language processing across languages.

Pages: 1451-1470
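The two quantities named in this abstract, surprisal (negative log probability in context) and contextual entropy (expected surprisal), can be illustrated with a minimal sketch. The next-word distribution below is invented for illustration and the choice of base-2 logarithms (bits) is an assumption; neither comes from the paper, which derives its estimates from trained language models.

```python
import math


def surprisal(p: float) -> float:
    """Surprisal in bits: negative log2 probability of a word in its context."""
    return -math.log2(p)


def contextual_entropy(dist: dict[str, float]) -> float:
    """Expected surprisal over a next-word distribution (zero-probability words skipped)."""
    return sum(p * surprisal(p) for p in dist.values() if p > 0)


# Hypothetical next-word distribution for some context (illustrative values only).
next_word = {"dog": 0.5, "cat": 0.25, "axolotl": 0.25}

print(surprisal(next_word["dog"]))      # 1.0 bit
print(surprisal(next_word["axolotl"]))  # 2.0 bits
print(contextual_entropy(next_word))    # 1.5 bits
```

Under surprisal theory as summarized above, the less predictable word ("axolotl" in this toy distribution) would be expected to take longer to read than the more predictable one.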

Citations: 0

Rank-Aware Negative Training for Semi-Supervised Text Classification

IF 10.9, Zone 1 (Computer Science)

Transactions of the Association for Computational Linguistics Pub Date : 2023-06-13 DOI: 10.1162/tacl_a_00574

Ahmed Murtadha, Shengfeng Pan, Wen Bo, Jianlin Su, Xinxin Cao, Wenze Zhang, Yunfeng Liu

Abstract: Semi-supervised text classification-based paradigms (SSTC) typically employ the spirit of self-training. The key idea is to train a deep classifier on limited labeled texts and then iteratively predict the unlabeled texts as their pseudo-labels for further training. However, the performance is largely affected by the accuracy of pseudo-labels, which may not be significant in real-world scenarios. This paper presents a Rank-aware Negative Training (RNT) framework to address SSTC in learning with noisy label settings. To alleviate the noisy information, we adapt a reasoning with uncertainty-based approach to rank the unlabeled texts based on the evidential support received from the labeled texts. Moreover, we propose the use of negative training to train RNT based on the concept that “the input instance does not belong to the complementary label”. A complementary label is randomly selected from all labels except the label on-target. Intuitively, the probability of a true label serving as a complementary label is low and thus provides less noisy information during the training, resulting in better performance on the test data. Finally, we evaluate the proposed solution on various text classification benchmark datasets. Our extensive experiments show that it consistently overcomes the state-of-the-art alternatives in most scenarios and achieves competitive performance in the others. The code of RNT is publicly available on GitHub.

Pages: 771-786
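The complementary-label idea described in this abstract can be sketched as follows. This is not the authors' RNT code: the function names are hypothetical, and the loss form `-log(1 - p_k)` is an assumption borrowed from the general negative-learning literature. Only the sampling rule (pick a label uniformly from all labels except the on-target one) is taken from the abstract.

```python
import math
import random


def sample_complementary_label(target: int, num_classes: int, rng: random.Random) -> int:
    """Pick a label uniformly at random from all labels except the on-target one."""
    candidates = [c for c in range(num_classes) if c != target]
    return rng.choice(candidates)


def negative_loss(probs: list[float], complementary: int) -> float:
    """Assumed loss form: penalize probability mass placed on the complementary label."""
    # Clamp to avoid log(0) when the model is fully confident in the complementary label.
    return -math.log(max(1.0 - probs[complementary], 1e-12))


rng = random.Random(0)
probs = [0.7, 0.2, 0.1]  # hypothetical classifier output for one unlabeled text
comp = sample_complementary_label(target=0, num_classes=3, rng=rng)
print(comp, negative_loss(probs, comp))
```

The loss only says "this text is not class `comp`", so a wrong pseudo-label does less damage than it would under ordinary positive training, which matches the intuition given in the abstract.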

Citations: 0

A Cross-Linguistic Pressure for Uniform Information Density in Word Order

IF 10.9, Zone 1 (Computer Science)

Transactions of the Association for Computational Linguistics Pub Date : 2023-06-06 DOI: 10.1162/tacl_a_00589

T. Clark, Clara Meister, Tiago Pimentel, Michael Hahn, Ryan Cotterell, Richard Futrell, Roger Levy (affiliations: MIT, ETH Zurich, University of Cambridge, Saarland University, UC Irvine)

Abstract: While natural languages differ widely in both canonical word order and word order flexibility, their word orders still follow shared cross-linguistic statistical patterns, often attributed to functional pressures. In the effort to identify these pressures, prior work has compared real and counterfactual word orders. Yet one functional pressure has been overlooked in such investigations: The uniform information density (UID) hypothesis, which holds that information should be spread evenly throughout an utterance. Here, we ask whether a pressure for UID may have influenced word order patterns cross-linguistically. To this end, we use computational models to test whether real orders lead to greater information uniformity than counterfactual orders. In our empirical study of 10 typologically diverse languages, we find that: (i) among SVO languages, real word orders consistently have greater uniformity than reverse word orders, and (ii) only linguistically implausible counterfactual orders consistently exceed the uniformity of real orders. These findings are compatible with a pressure for information uniformity in the development and usage of natural languages.

Pages: 1048-1065
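To make "information spread evenly throughout an utterance" concrete, here is a small sketch that scores uniformity as the variance of per-word surprisal. This particular measure is an assumption chosen for illustration; the abstract does not say which uniformity measure the paper uses, and the word probabilities below are invented.

```python
import math
from statistics import pvariance


def surprisal_variance(word_probs: list[float]) -> float:
    """Variance of per-word surprisal (bits); lower means more uniform information."""
    surprisals = [-math.log2(p) for p in word_probs]
    return pvariance(surprisals)


# Two hypothetical orderings carrying the same total information (8 bits).
even = [0.25, 0.25, 0.25, 0.25]    # surprisals 2, 2, 2, 2
uneven = [0.5, 0.5, 0.125, 0.125]  # surprisals 1, 1, 3, 3

print(surprisal_variance(even))    # 0.0
print(surprisal_variance(uneven))  # 1.0
```

In a comparison of real and counterfactual word orders, a measure like this would be computed per sentence under each ordering and then compared across orderings.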

Citations: 0

Supervised Gradual Machine Learning for Aspect-Term Sentiment Analysis

IF 10.9, Zone 1 (Computer Science)

Transactions of the Association for Computational Linguistics Pub Date : 2023-06-01 DOI: 10.1162/tacl_a_00571

Yanyan Wang, Qun Chen, Murtadha Ahmed, Zhaoqiang Chen, Jing Su, Wei Pan, Zhanhuai Li

Abstract: Recent work has shown that Aspect-Term Sentiment Analysis (ATSA) can be effectively performed by Gradual Machine Learning (GML). However, the performance of the current unsupervised solution is limited by inaccurate and insufficient knowledge conveyance. In this paper, we propose a supervised GML approach for ATSA, which can effectively exploit labeled training data to improve knowledge conveyance. It leverages binary polarity relations between instances, which can be either similar or opposite, to enable supervised knowledge conveyance. Besides the explicit polarity relations indicated by discourse structures, it also separately supervises a polarity classification DNN and a binary Siamese network to extract implicit polarity relations. The proposed approach fulfills knowledge conveyance by modeling detected relations as binary features in a factor graph. Our extensive experiments on real benchmark data show that it achieves the state-of-the-art performance across all the test workloads. Our work demonstrates clearly that, in collaboration with DNN for feature extraction, GML outperforms pure DNN solutions.

Pages: 723-739

Citations: 2