Transactions of the Association for Computational Linguistics: Latest Publications

PaniniQA: Enhancing Patient Education Through Interactive Question Answering
IF 10.9 | CAS Region 1 (Computer Science)
Transactions of the Association for Computational Linguistics | Pub Date: 2023-08-07 | DOI: 10.1162/tacl_a_00616
Pengshan Cai, Zonghai Yao, Fei Liu, Dakuo Wang, Meghan Reilly, Huixue Zhou, Lingxi Li, Yifan Cao, Alok Kapoor, Adarsha S. Bajracharya, D. Berlowitz, Hongfeng Yu
{"title":"PaniniQA: Enhancing Patient Education Through Interactive Question Answering","authors":"Pengshan Cai, Zonghai Yao, Fei Liu, Dakuo Wang, Meghan Reilly, Huixue Zhou, Lingxi Li, Yifan Cao, Alok Kapoor, Adarsha S. Bajracharya, D. Berlowitz, Hongfeng Yu","doi":"10.1162/tacl_a_00616","DOIUrl":"https://doi.org/10.1162/tacl_a_00616","url":null,"abstract":"Abstract A patient portal allows discharged patients to access their personalized discharge instructions in electronic health records (EHRs). However, many patients have difficulty understanding or memorizing their discharge instructions (Zhao et al., 2017). In this paper, we present PaniniQA, a patient-centric interactive question answering system designed to help patients understand their discharge instructions. PaniniQA first identifies important clinical content from patients’ discharge instructions and then formulates patient-specific educational questions. In addition, PaniniQA is also equipped with answer verification functionality to provide timely feedback to correct patients’ misunderstandings. Our comprehensive automatic & human evaluation results demonstrate our PaniniQA is capable of improving patients’ mastery of their medical instructions through effective interactions.1","PeriodicalId":33559,"journal":{"name":"Transactions of the Association for Computational Linguistics","volume":"2 1","pages":"1518-1536"},"PeriodicalIF":10.9,"publicationDate":"2023-08-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139351516","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
Learning to Paraphrase Sentences to Different Complexity Levels
IF 10.9 | CAS Region 1 (Computer Science)
Transactions of the Association for Computational Linguistics | Pub Date: 2023-08-04 | DOI: 10.1162/tacl_a_00606
Alison Chi, Li-Kuang Chen, Yi-Chen Chang, Shu-Hui Lee, Jason J. S. Chang
{"title":"Learning to Paraphrase Sentences to Different Complexity Levels","authors":"Alison Chi, Li-Kuang Chen, Yi-Chen Chang, Shu-Hui Lee, Jason J. S. Chang","doi":"10.1162/tacl_a_00606","DOIUrl":"https://doi.org/10.1162/tacl_a_00606","url":null,"abstract":"Abstract While sentence simplification is an active research topic in NLP, its adjacent tasks of sentence complexification and same-level paraphrasing are not. To train models on all three tasks, we present two new unsupervised datasets. We compare these datasets, one labeled by a weak classifier and the other by a rule-based approach, with a single supervised dataset. Using these three datasets for training, we perform extensive experiments on both multitasking and prompting strategies. Compared to other systems trained on unsupervised parallel data, models trained on our weak classifier labeled dataset achieve state-of-the-art performance on the ASSET simplification benchmark. Our models also outperform previous work on sentence-level targeting. Finally, we establish how a handful of Large Language Models perform on these tasks under a zero-shot setting.","PeriodicalId":33559,"journal":{"name":"Transactions of the Association for Computational Linguistics","volume":"94 1","pages":"1332-1354"},"PeriodicalIF":10.9,"publicationDate":"2023-08-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139351690","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
Collective Human Opinions in Semantic Textual Similarity
IF 10.9 | CAS Region 1 (Computer Science)
Transactions of the Association for Computational Linguistics | Pub Date: 2023-08-01 | DOI: 10.1162/tacl_a_00584
Yuxia Wang, Shimin Tao, Ning Xie, Hao Yang, Timothy Baldwin, K. Verspoor
{"title":"Collective Human Opinions in Semantic Textual Similarity","authors":"Yuxia Wang, Shimin Tao, Ning Xie, Hao Yang, Timothy Baldwin, K. Verspoor","doi":"10.1162/tacl_a_00584","DOIUrl":"https://doi.org/10.1162/tacl_a_00584","url":null,"abstract":"Abstract Despite the subjective nature of semantic textual similarity (STS) and pervasive disagreements in STS annotation, existing benchmarks have used averaged human ratings as gold standard. Averaging masks the true distribution of human opinions on examples of low agreement, and prevents models from capturing the semantic vagueness that the individual ratings represent. In this work, we introduce USTS, the first Uncertainty-aware STS dataset with ∼15,000 Chinese sentence pairs and 150,000 labels, to study collective human opinions in STS. Analysis reveals that neither a scalar nor a single Gaussian fits a set of observed judgments adequately. We further show that current STS models cannot capture the variance caused by human disagreement on individual instances, but rather reflect the predictive confidence over the aggregate dataset.","PeriodicalId":33559,"journal":{"name":"Transactions of the Association for Computational Linguistics","volume":"11 1","pages":"997-1013"},"PeriodicalIF":10.9,"publicationDate":"2023-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"42343606","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 2
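The core claim, that neither a scalar nor a single Gaussian adequately fits a set of observed judgments, is easy to see on a toy example. The sketch below is illustrative only; the ratings are invented, not drawn from USTS. It fits a single Gaussian by maximum likelihood to a bimodal set of annotator scores and compares it against a hand-picked two-component mixture:

```python
# Illustrative only: why one Gaussian can misrepresent split annotator opinions.
# The ratings below are invented, not from the USTS dataset.
import numpy as np
from scipy.stats import norm

ratings = np.array([1.0, 1.0, 1.5, 4.5, 5.0, 5.0])  # annotators split into two camps

# Single-Gaussian MLE: the mean lands between the camps, where nobody rated.
mu, sigma = ratings.mean(), ratings.std()
single_ll = norm.logpdf(ratings, mu, sigma).sum()

# Two-component mixture with hand-picked components, for comparison.
comps = [norm(1.2, 0.3), norm(4.8, 0.3)]
mix_ll = np.log(0.5 * comps[0].pdf(ratings) + 0.5 * comps[1].pdf(ratings)).sum()

print(f"single Gaussian: mu={mu:.2f}, log-lik={single_ll:.2f}")  # mu=3.00: no one rated 3
print(f"two-component mixture: log-lik={mix_ll:.2f}")            # substantially higher
```

The single Gaussian centers its mass on a score no annotator produced, which is exactly the averaging artifact the paper warns about.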
Time-and-Space-Efficient Weighted Deduction
IF 10.9 | CAS Region 1 (Computer Science)
Transactions of the Association for Computational Linguistics | Pub Date: 2023-08-01 | DOI: 10.1162/tacl_a_00588
Jason Eisner
{"title":"Time-and-Space-Efficient Weighted Deduction","authors":"Jason Eisner","doi":"10.1162/tacl_a_00588","DOIUrl":"https://doi.org/10.1162/tacl_a_00588","url":null,"abstract":"Abstract Many NLP algorithms have been described in terms of deduction systems. Unweighted deduction allows a generic forward-chaining execution strategy. For weighted deduction, however, efficient execution should propagate the weight of each item only after it has converged. This means visiting the items in topologically sorted order (as in dynamic programming). Toposorting is fast on a materialized graph; unfortunately, materializing the graph would take extra space. Is there a generic weighted deduction strategy which, for every acyclic deduction system and every input, uses only a constant factor more time and space than generic unweighted deduction? After reviewing past strategies, we answer this question in the affirmative by combining ideas of Goodman (1999) and Kahn (1962). We also give an extension to cyclic deduction systems, based on Tarjan (1972).","PeriodicalId":33559,"journal":{"name":"Transactions of the Association for Computational Linguistics","volume":"11 1","pages":"960-973"},"PeriodicalIF":10.9,"publicationDate":"2023-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"64440766","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 1
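For the acyclic case, the two ingredients named in the abstract, Kahn's (1962) topological sort and converged-weight propagation, can be sketched in a few lines. Note the simplifications: this toy materializes the graph (the very thing the paper avoids) and uses plain edges rather than the hyperedges of a real deduction system, so it illustrates the visiting order, not the paper's space bound.

```python
# A minimal sketch of weight propagation in topological order (Kahn, 1962),
# for an acyclic *graph-shaped* deduction system where each item's weight is
# the sum over incoming edges of (parent weight * edge weight). Real weighted
# deduction uses hypergraphs, and the paper's point is achieving this without
# materializing the graph; this sketch materializes it for clarity.
from collections import defaultdict, deque

def propagate(edges, axioms):
    """edges: list of (src, dst, weight); axioms: {item: initial weight}."""
    out = defaultdict(list)
    indeg = defaultdict(int)
    weight = defaultdict(float, axioms)
    nodes = set(axioms)
    for u, v, w in edges:
        out[u].append((v, w))
        indeg[v] += 1
        nodes.update((u, v))
    # Kahn's algorithm: only pop an item once all of its inputs have converged.
    queue = deque(n for n in nodes if indeg[n] == 0)
    while queue:
        u = queue.popleft()
        for v, w in out[u]:
            weight[v] += weight[u] * w   # weight[u] is final when u is popped
            indeg[v] -= 1
            if indeg[v] == 0:
                queue.append(v)
    return dict(weight)

print(propagate([("a", "c", 0.5), ("b", "c", 2.0), ("c", "d", 1.0)],
                {"a": 1.0, "b": 3.0}))
# {'a': 1.0, 'b': 3.0, 'c': 6.5, 'd': 6.5}
```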
Multi3WOZ: A Multilingual, Multi-Domain, Multi-Parallel Dataset for Training and Evaluating Culturally Adapted Task-Oriented Dialog Systems
IF 10.9 | CAS Region 1 (Computer Science)
Transactions of the Association for Computational Linguistics | Pub Date: 2023-07-26 | DOI: 10.1162/tacl_a_00609
Songbo Hu, Han Zhou, Mete Hergul, Milan Gritta, Guchun Zhang, Ignacio Iacobacci, Ivan Vulic, A. Korhonen
{"title":"Multi 3 WOZ: A Multilingual, Multi-Domain, Multi-Parallel Dataset for Training and Evaluating Culturally Adapted Task-Oriented Dialog Systems","authors":"Songbo Hu, Han Zhou, Mete Hergul, Milan Gritta, Guchun Zhang, Ignacio Iacobacci, Ivan Vulic, A. Korhonen","doi":"10.1162/tacl_a_00609","DOIUrl":"https://doi.org/10.1162/tacl_a_00609","url":null,"abstract":"Abstract Creating high-quality annotated data for task-oriented dialog (ToD) is known to be notoriously difficult, and the challenges are amplified when the goal is to create equitable, culturally adapted, and large-scale ToD datasets for multiple languages. Therefore, the current datasets are still very scarce and suffer from limitations such as translation-based non-native dialogs with translation artefacts, small scale, or lack of cultural adaptation, among others. In this work, we first take stock of the current landscape of multilingual ToD datasets, offering a systematic overview of their properties and limitations. Aiming to reduce all the detected limitations, we then introduce Multi3WOZ, a novel multilingual, multi-domain, multi-parallel ToD dataset. It is large-scale and offers culturally adapted dialogs in 4 languages to enable training and evaluation of multilingual and cross-lingual ToD systems. We describe a complex bottom–up data collection process that yielded the final dataset, and offer the first sets of baseline scores across different ToD-related tasks for future reference, also highlighting its challenging nature.","PeriodicalId":33559,"journal":{"name":"Transactions of the Association for Computational Linguistics","volume":"17 1","pages":"1396-1415"},"PeriodicalIF":10.9,"publicationDate":"2023-07-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139354806","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
Optimal Transport Posterior Alignment for Cross-lingual Semantic Parsing
IF 10.9 | CAS Region 1 (Computer Science)
Transactions of the Association for Computational Linguistics | Pub Date: 2023-07-09 | DOI: 10.1162/tacl_a_00611
Tom Sherborne, Tom Hosking, Mirella Lapata
{"title":"Optimal Transport Posterior Alignment for Cross-lingual Semantic Parsing","authors":"Tom Sherborne, Tom Hosking, Mirella Lapata","doi":"10.1162/tacl_a_00611","DOIUrl":"https://doi.org/10.1162/tacl_a_00611","url":null,"abstract":"Abstract Cross-lingual semantic parsing transfers parsing capability from a high-resource language (e.g., English) to low-resource languages with scarce training data. Previous work has primarily considered silver-standard data augmentation or zero-shot methods; exploiting few-shot gold data is comparatively unexplored. We propose a new approach to cross-lingual semantic parsing by explicitly minimizing cross-lingual divergence between probabilistic latent variables using Optimal Transport. We demonstrate how this direct guidance improves parsing from natural languages using fewer examples and less training. We evaluate our method on two datasets, MTOP and MultiATIS++SQL, establishing state-of-the-art results under a few-shot cross-lingual regime. Ablation studies further reveal that our method improves performance even without parallel input translations. In addition, we show that our model better captures cross-lingual structure in the latent space to improve semantic representation similarity.1","PeriodicalId":33559,"journal":{"name":"Transactions of the Association for Computational Linguistics","volume":"43 1","pages":"1432-1450"},"PeriodicalIF":10.9,"publicationDate":"2023-07-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139361256","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
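As a rough illustration of the machinery involved, the following is a generic entropy-regularized optimal transport (Sinkhorn) computation between two batches of latent vectors. It is not the authors' model: the paper minimizes divergence between probabilistic latent variables inside a parser, whereas this sketch just measures how far apart two point clouds are under a squared-L2 ground cost, with all names and data invented.

```python
# A generic Sinkhorn sketch for entropy-regularized optimal transport between
# two batches of latent vectors; illustrative only, not the paper's model.
import numpy as np

def sinkhorn_cost(X, Y, eps=0.1, iters=200):
    """X: (n, d), Y: (m, d). Approximate OT cost under squared-L2 ground cost."""
    n, m = len(X), len(Y)
    a, b = np.full(n, 1.0 / n), np.full(m, 1.0 / m)      # uniform marginals
    C = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)   # pairwise cost matrix
    K = np.exp(-C / eps)
    u, v = np.ones(n), np.ones(m)
    for _ in range(iters):                               # Sinkhorn fixed-point updates
        u = a / (K @ v)
        v = b / (K.T @ u)
    P = u[:, None] * K * v[None, :]                      # transport plan
    return (P * C).sum()

rng = np.random.default_rng(0)
en = rng.normal(0.0, 1.0, size=(8, 4))   # e.g., one language's latent codes
fr = rng.normal(0.5, 1.0, size=(8, 4))   # a shifted "other language" cloud
print(sinkhorn_cost(en, fr))             # shrinks toward 0 as the clouds align
```

Minimizing a cost like this with respect to the encoder's parameters is the general shape of the "direct guidance" the abstract describes.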
Testing the Predictions of Surprisal Theory in 11 Languages
IF 10.9 | CAS Region 1 (Computer Science)
Transactions of the Association for Computational Linguistics | Pub Date: 2023-07-07 | DOI: 10.1162/tacl_a_00612
Ethan Gotlieb Wilcox, Tiago Pimentel, Clara Meister, Ryan Cotterell, R. Levy
{"title":"Testing the Predictions of Surprisal Theory in 11 Languages","authors":"Ethan Gotlieb Wilcox, Tiago Pimentel, Clara Meister, Ryan Cotterell, R. Levy","doi":"10.1162/tacl_a_00612","DOIUrl":"https://doi.org/10.1162/tacl_a_00612","url":null,"abstract":"Abstract Surprisal theory posits that less-predictable words should take more time to process, with word predictability quantified as surprisal, i.e., negative log probability in context. While evidence supporting the predictions of surprisal theory has been replicated widely, much of it has focused on a very narrow slice of data: native English speakers reading English texts. Indeed, no comprehensive multilingual analysis exists. We address this gap in the current literature by investigating the relationship between surprisal and reading times in eleven different languages, distributed across five language families. Deriving estimates from language models trained on monolingual and multilingual corpora, we test three predictions associated with surprisal theory: (i) whether surprisal is predictive of reading times, (ii) whether expected surprisal, i.e., contextual entropy, is predictive of reading times, and (iii) whether the linking function between surprisal and reading times is linear. We find that all three predictions are borne out crosslinguistically. By focusing on a more diverse set of languages, we argue that these results offer the most robust link to date between information theory and incremental language processing across languages.","PeriodicalId":33559,"journal":{"name":"Transactions of the Association for Computational Linguistics","volume":"5 1","pages":"1451-1470"},"PeriodicalIF":10.9,"publicationDate":"2023-07-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139362131","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
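The quantities being tested are simple to state: surprisal is s(w_t) = -log p(w_t | context), and contextual entropy is its expectation over the next-word distribution. The toy bigram sketch below illustrates both in bits; the paper derives these estimates from trained monolingual and multilingual language models, and the counts here are invented.

```python
# Surprisal s(w_t) = -log2 p(w_t | context), here with a toy bigram model.
# The paper estimates p with neural LMs; the counts below are invented.
import math

bigram_counts = {
    ("the", "dog"): 10, ("the", "cat"): 10, ("the", "platypus"): 1,
}
context_totals = {"the": 21}

def surprisal(prev, word):
    p = bigram_counts[(prev, word)] / context_totals[prev]
    return -math.log2(p)

print(f"{surprisal('the', 'dog'):.2f} bits")       # ~1.07: predictable
print(f"{surprisal('the', 'platypus'):.2f} bits")  # ~4.39: unpredictable, read slower

# Contextual entropy = expected surprisal over the next-word distribution.
H = sum((c / context_totals["the"]) * -math.log2(c / context_totals["the"])
        for c in bigram_counts.values())
print(f"H(next | 'the') = {H:.2f} bits")
```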
Rank-Aware Negative Training for Semi-Supervised Text Classification
IF 10.9 | CAS Region 1 (Computer Science)
Transactions of the Association for Computational Linguistics | Pub Date: 2023-06-13 | DOI: 10.1162/tacl_a_00574
Ahmed Murtadha, Shengfeng Pan, Wen Bo, Jianlin Su, Xinxin Cao, Wenze Zhang, Yunfeng Liu
{"title":"Rank-Aware Negative Training for Semi-Supervised Text Classification","authors":"Ahmed Murtadha, Shengfeng Pan, Wen Bo, Jianlin Su, Xinxin Cao, Wenze Zhang, Yunfeng Liu","doi":"10.1162/tacl_a_00574","DOIUrl":"https://doi.org/10.1162/tacl_a_00574","url":null,"abstract":"Abstract Semi-supervised text classification-based paradigms (SSTC) typically employ the spirit of self-training. The key idea is to train a deep classifier on limited labeled texts and then iteratively predict the unlabeled texts as their pseudo-labels for further training. However, the performance is largely affected by the accuracy of pseudo-labels, which may not be significant in real-world scenarios. This paper presents a Rank-aware Negative Training (RNT) framework to address SSTC in learning with noisy label settings. To alleviate the noisy information, we adapt a reasoning with uncertainty-based approach to rank the unlabeled texts based on the evidential support received from the labeled texts. Moreover, we propose the use of negative training to train RNT based on the concept that “the input instance does not belong to the complementary label”. A complementary label is randomly selected from all labels except the label on-target. Intuitively, the probability of a true label serving as a complementary label is low and thus provides less noisy information during the training, resulting in better performance on the test data. Finally, we evaluate the proposed solution on various text classification benchmark datasets. Our extensive experiments show that it consistently overcomes the state-of-the-art alternatives in most scenarios and achieves competitive performance in the others. The code of RNT is publicly available on GitHub.","PeriodicalId":33559,"journal":{"name":"Transactions of the Association for Computational Linguistics","volume":"11 1","pages":"771-786"},"PeriodicalIF":10.9,"publicationDate":"2023-06-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"49384920","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
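The negative-training idea quoted above has a compact loss: rather than maximizing the probability of a possibly noisy pseudo-label, minimize the probability of a randomly drawn complementary label. A minimal PyTorch sketch of that loss follows; it omits RNT's rank-aware selection of which instances to train on, and the function name and shapes are illustrative, not the authors' code.

```python
# A minimal sketch of the negative-training loss: instead of pushing up the
# probability of a (possibly noisy) pseudo-label, push *down* the probability
# of a randomly chosen complementary label c != pseudo-label.
# Illustrative only; RNT adds rank-aware instance selection on top of this.
import torch
import torch.nn.functional as F

def negative_training_loss(logits, pseudo_labels):
    """logits: (batch, n_classes); pseudo_labels: (batch,)."""
    n_classes = logits.size(1)
    # Sample a complementary label uniformly from the other classes.
    offset = torch.randint(1, n_classes, pseudo_labels.shape)
    comp = (pseudo_labels + offset) % n_classes
    p_comp = F.softmax(logits, dim=1).gather(1, comp.unsqueeze(1)).squeeze(1)
    return -torch.log(1.0 - p_comp + 1e-8).mean()

logits = torch.randn(4, 3, requires_grad=True)
loss = negative_training_loss(logits, torch.tensor([0, 1, 2, 0]))
loss.backward()  # gradient lowers p(complementary label): a weaker, noise-tolerant signal
print(loss.item())
```

Because a wrong pseudo-label is rarely the sampled complementary label, the gradient signal stays mostly correct even when pseudo-labels are noisy, which is the intuition the abstract states.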
A Cross-Linguistic Pressure for Uniform Information Density in Word Order
IF 10.9 | CAS Region 1 (Computer Science)
Transactions of the Association for Computational Linguistics | Pub Date: 2023-06-06 | DOI: 10.1162/tacl_a_00589
T. Clark, Clara Meister, Tiago Pimentel, Michael Hahn, Ryan Cotterell, Richard Futrell, Roger Levy (MIT, ETH Zurich, University of Cambridge, Saarland University, UC Irvine)
{"title":"A Cross-Linguistic Pressure for Uniform Information Density in Word Order","authors":"T. Clark, Clara Meister, Tiago Pimentel, Michael Hahn, Ryan Cotterell, Richard Futrell, Roger Levy Mit, E. Zurich, U. Cambridge, Saarland University, UC Irvine","doi":"10.1162/tacl_a_00589","DOIUrl":"https://doi.org/10.1162/tacl_a_00589","url":null,"abstract":"Abstract While natural languages differ widely in both canonical word order and word order flexibility, their word orders still follow shared cross-linguistic statistical patterns, often attributed to functional pressures. In the effort to identify these pressures, prior work has compared real and counterfactual word orders. Yet one functional pressure has been overlooked in such investigations: The uniform information density (UID) hypothesis, which holds that information should be spread evenly throughout an utterance. Here, we ask whether a pressure for UID may have influenced word order patterns cross-linguistically. To this end, we use computational models to test whether real orders lead to greater information uniformity than counterfactual orders. In our empirical study of 10 typologically diverse languages, we find that: (i) among SVO languages, real word orders consistently have greater uniformity than reverse word orders, and (ii) only linguistically implausible counterfactual orders consistently exceed the uniformity of real orders. These findings are compatible with a pressure for information uniformity in the development and usage of natural languages.1","PeriodicalId":33559,"journal":{"name":"Transactions of the Association for Computational Linguistics","volume":"11 1","pages":"1048-1065"},"PeriodicalIF":10.9,"publicationDate":"2023-06-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"48703963","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
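One common way to operationalize UID in this line of work is to score an utterance by the spread of its per-word surprisals: the lower the variance, the more evenly the information is distributed. The sketch below compares an invented "real order" against a "reverse order" carrying the same total information; the surprisal values are made up purely to illustrate the comparison, not taken from the paper.

```python
# UID operationalized as low variance of per-word surprisal. The two surprisal
# sequences below are invented; they carry the same total information (11 bits)
# but distribute it differently across the utterance.
import statistics

real_order = [2.1, 2.4, 2.0, 2.3, 2.2]      # bits per word: evenly spread
reverse_order = [0.4, 0.9, 1.6, 3.8, 4.3]   # same total, bunched at the end

for name, s in [("real", real_order), ("reverse", reverse_order)]:
    print(name, round(statistics.pvariance(s), 3))  # lower variance = more uniform
```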
Supervised Gradual Machine Learning for Aspect-Term Sentiment Analysis
IF 10.9 | CAS Region 1 (Computer Science)
Transactions of the Association for Computational Linguistics | Pub Date: 2023-06-01 | DOI: 10.1162/tacl_a_00571
Yanyan Wang, Qun Chen, Murtadha Ahmed, Zhaoqiang Chen, Jing Su, Wei Pan, Zhanhuai Li
{"title":"Supervised Gradual Machine Learning for Aspect-Term Sentiment Analysis","authors":"Yanyan Wang, Qun Chen, Murtadha Ahmed, Zhaoqiang Chen, Jing Su, Wei Pan, Zhanhuai Li","doi":"10.1162/tacl_a_00571","DOIUrl":"https://doi.org/10.1162/tacl_a_00571","url":null,"abstract":"Recent work has shown that Aspect-Term Sentiment Analysis (ATSA) can be effectively performed by Gradual Machine Learning (GML). However, the performance of the current unsupervised solution is limited by inaccurate and insufficient knowledge conveyance. In this paper, we propose a supervised GML approach for ATSA, which can effectively exploit labeled training data to improve knowledge conveyance. It leverages binary polarity relations between instances, which can be either similar or opposite, to enable supervised knowledge conveyance. Besides the explicit polarity relations indicated by discourse structures, it also separately supervises a polarity classification DNN and a binary Siamese network to extract implicit polarity relations. The proposed approach fulfills knowledge conveyance by modeling detected relations as binary features in a factor graph. Our extensive experiments on real benchmark data show that it achieves the state-of-the-art performance across all the test workloads. Our work demonstrates clearly that, in collaboration with DNN for feature extraction, GML outperforms pure DNN solutions.","PeriodicalId":33559,"journal":{"name":"Transactions of the Association for Computational Linguistics","volume":"11 1","pages":"723-739"},"PeriodicalIF":10.9,"publicationDate":"2023-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"49253493","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 2
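To make "knowledge conveyance" over binary polarity relations concrete, here is a heavily simplified toy: seed labels propagate to unlabeled instances along relations marked similar (+1) or opposite (-1). The actual approach performs gradual inference over a factor graph with relations extracted by a DNN and a Siamese network; this sketch, with invented sentences and a naive update rule, only shows how such relations carry polarity from labeled to unlabeled instances.

```python
# Toy reduction of knowledge conveyance: propagate sentiment polarity from
# labeled seeds to unlabeled instances over binary relations, where sign=+1
# means "same polarity" and sign=-1 means "opposite polarity". The real method
# instead runs inference over a factor graph; this is only an illustration.
def propagate_polarity(labels, relations, iters=10):
    """labels: {node: +1.0/-1.0} seeds; relations: list of (u, v, sign)."""
    scores = dict(labels)
    for _ in range(iters):
        for u, v, sign in relations:
            for a, b in ((u, v), (v, u)):   # relations convey evidence both ways
                if a in scores and b not in labels:
                    # Nudge b toward sign * polarity(a); seed labels stay fixed.
                    scores[b] = scores.get(b, 0.0) + 0.5 * sign * scores[a]
    return {n: ("positive" if s > 0 else "negative") for n, s in scores.items()}

relations = [("s1", "s2", +1),   # e.g., "great food" ~ "lovely service": similar
             ("s2", "s3", -1)]   # ...but "slow kitchen" reads as opposite
print(propagate_polarity({"s1": +1.0}, relations))
# {'s1': 'positive', 's2': 'positive', 's3': 'negative'}
```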