Findings (Sydney (N.S.W.)): Latest Articles

Context Generation Improves Open Domain Question Answering
Findings (Sydney (N.S.W.)) Pub Date: 2022-10-12 DOI: 10.48550/arXiv.2210.06349
Dan Su, M. Patwary, Shrimai Prabhumoye, Peng Xu, R. Prenger, M. Shoeybi, Pascale Fung, Anima Anandkumar, Bryan Catanzaro
Abstract: Closed-book question answering (QA) requires a model to directly answer an open-domain question without access to any external knowledge. Prior work on closed-book QA either directly finetunes or prompts a pretrained language model (LM) to leverage the stored knowledge. However, these approaches do not fully exploit the parameterized knowledge. To address this inefficiency, we propose a two-stage, closed-book QA framework which employs a coarse-to-fine approach to extract the relevant knowledge and answer a question. We first generate a related context for a given question by prompting a pretrained LM. We then prompt the same LM to generate an answer using the generated context and the question. Additionally, we marginalize over the generated contexts to improve accuracy and reduce context uncertainty. Experimental results on three QA benchmarks show that our method significantly outperforms previous closed-book QA methods. For example, on TriviaQA, our method improves exact match accuracy from 55.3% to 68.6%, and is on par with open-book QA methods (68.6% vs. 68.0%). Our results show that our new methodology is able to better exploit the stored knowledge in pretrained LMs without adding extra learnable parameters or needing finetuning, and paves the way for hybrid models that integrate pretrained LMs with external knowledge.
Pages: 781-796
Citations: 3
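The abstract above mentions marginalizing over generated contexts but does not give the scoring rule. The following is a minimal sketch of one common way to do this, assuming each generated context yields one candidate answer with a weight (for example, a product of context and answer probabilities). The function names `normalize` and `marginalize` and the weighting scheme are illustrative assumptions, not the authors' implementation.

```python
from collections import defaultdict


def normalize(answer: str) -> str:
    """Lowercase and collapse whitespace so equivalent answers are pooled."""
    return " ".join(answer.lower().split())


def marginalize(candidates):
    """Pick the answer with the largest total weight across contexts.

    candidates: list of (answer, weight) pairs, one per generated context.
    The weights are hypothetical scores; the paper's exact scoring is not
    stated in the abstract.
    """
    totals = defaultdict(float)
    for answer, weight in candidates:
        totals[normalize(answer)] += weight
    # The answer supported by the most total weight wins.
    return max(totals, key=totals.get)
```

With candidates such as `[("Paris", 0.4), ("Lyon", 0.5), ("paris", 0.3)]`, the two spellings of Paris pool their weights and outscore the single highest-weighted alternative.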
Zero-Shot On-the-Fly Event Schema Induction
Findings (Sydney (N.S.W.)) Pub Date: 2022-10-12 DOI: 10.48550/arXiv.2210.06254
Rotem Dror, Haoyu Wang, D. Roth
Abstract: What are the events involved in a pandemic outbreak? What steps should be taken when planning a wedding? The answers to these questions can be found by collecting many documents on the complex event of interest, extracting relevant information, and analyzing it. We present a new approach in which large language models are utilized to generate source documents that allow predicting, given a high-level event definition, the specific events, arguments, and relations between them to construct a schema that describes the complex event in its entirety. Using our model, complete schemas on any topic can be generated on-the-fly without any manual data collection, i.e., in a zero-shot manner. Moreover, we develop efficient methods to extract pertinent information from texts and demonstrate in a series of experiments that these schemas are considered to be more complete than human-curated ones in the majority of examined scenarios. Finally, we show that this framework is comparable in performance with previous supervised schema induction methods that rely on collecting real texts, even reaching the best score in the prediction task.
Pages: 693-713
Citations: 8
PriMeSRL-Eval: A Practical Quality Metric for Semantic Role Labeling Systems Evaluation
Findings (Sydney (N.S.W.)) Pub Date: 2022-10-12 DOI: 10.48550/arXiv.2210.06408
Ishan Jindal, Alexandre Rademaker, Khoi-Nguyen Tran, Huaiyu Zhu, H. Kanayama, Marina Danilevsky, Yunyao Li
Abstract: Semantic role labeling (SRL) identifies the predicate-argument structure in a sentence. This task is usually accomplished in four steps: predicate identification, predicate sense disambiguation, argument identification, and argument classification. Errors introduced at one step propagate to later steps. Unfortunately, the existing SRL evaluation scripts do not consider the full effect of this error propagation. They either evaluate arguments independent of predicate sense (CoNLL09) or do not evaluate predicate sense at all (CoNLL05), yielding an inaccurate measure of SRL model performance on the argument classification task. In this paper, we address key practical issues with existing evaluation scripts and propose a stricter SRL evaluation metric, PriMeSRL. We observe that by employing PriMeSRL, the quality evaluation of all SoTA SRL models drops significantly, and their relative rankings also change. We also show that PriMeSRL successfully penalizes actual failures in SoTA SRL models.
Pages: 1761-1773
Citations: 0
Machine Translation between Spoken Languages and Signed Languages Represented in SignWriting
Findings (Sydney (N.S.W.)) Pub Date: 2022-10-11 DOI: 10.48550/arXiv.2210.05404
Zifan Jiang, Amit Moryossef, Mathias Muller, Sarah Ebling
Abstract: This paper presents work on novel machine translation (MT) systems between spoken and signed languages, where signed languages are represented in SignWriting, a sign language writing system. Our work seeks to address the lack of out-of-the-box support for signed languages in current MT systems and is based on the SignBank dataset, which contains pairs of spoken language text and SignWriting content. We introduce novel methods to parse, factorize, decode, and evaluate SignWriting, leveraging ideas from neural factored MT. In a bilingual setup, translating from American Sign Language to (American) English, our method achieves over 30 BLEU, while in two multilingual setups, translating in both directions between spoken languages and signed languages, we achieve over 20 BLEU. We find that common MT techniques used to improve spoken language translation similarly affect the performance of sign language translation. These findings validate our use of an intermediate text representation for signed languages to include them in natural language processing research.
Pages: 1661-1679
Citations: 8
ViLPAct: A Benchmark for Compositional Generalization on Multimodal Human Activities
Findings (Sydney (N.S.W.)) Pub Date: 2022-10-11 DOI: 10.48550/arXiv.2210.05556
Terry Yue Zhuo, Yaqing Liao, Yuecheng Lei, Lizhen Qu, Gerard de Melo, Xiaojun Chang, Yazhou Ren, Zenglin Xu
Abstract: We introduce ViLPAct, a novel vision-language benchmark for human activity planning. It is designed for a task where embodied AI agents can reason and forecast future actions of humans based on video clips about their initial activities and intents in text. The dataset consists of 2.9k videos from Charades extended with intents via crowdsourcing, a multi-choice question test set, and four strong baselines. One of the baselines implements a neurosymbolic approach based on a multi-modal knowledge base (MKB), while the others are deep generative models adapted from recent state-of-the-art (SOTA) methods. According to our extensive experiments, the key challenges are compositional generalization and effective use of information from both modalities.
Pages: 2147-2162
Citations: 0
Hierarchical3D Adapters for Long Video-to-text Summarization
Findings (Sydney (N.S.W.)) Pub Date: 2022-10-10 DOI: 10.48550/arXiv.2210.04829
Pinelopi Papalampidi, Mirella Lapata
Abstract: In this paper, we focus on video-to-text summarization and investigate how to best utilize multimodal information for summarizing long inputs (e.g., an hour-long TV show) into long outputs (e.g., a multi-sentence summary). We extend SummScreen (Chen et al., 2022), a dialogue summarization dataset consisting of transcripts of TV episodes with reference summaries, and create a multimodal variant by collecting corresponding full-length videos. We incorporate multimodal information into a pre-trained textual summarizer efficiently using adapter modules augmented with a hierarchical structure while tuning only 3.8% of model parameters. Our experiments demonstrate that multimodal information offers superior performance over more memory-heavy and fully fine-tuned textual summarization methods.
Pages: 1267-1290
Citations: 2
Fighting FIRe with FIRE: Assessing the Validity of Text-to-Video Retrieval Benchmarks
Findings (Sydney (N.S.W.)) Pub Date: 2022-10-10 DOI: 10.48550/arXiv.2210.05038
Pedro Rodriguez, Mahmoud Azab, Becka Silvert, Renato Sanchez, Linzy Labson, Hardik Shah, Seungwhan Moon
Abstract: Searching troves of videos with textual descriptions is a core multimodal retrieval task. Owing to the lack of a purpose-built dataset for text-to-video retrieval, video captioning datasets have been re-purposed to evaluate models by (1) treating captions as positive matches to their respective videos and (2) assuming all other videos to be negatives. However, this methodology leads to a fundamental flaw during evaluation: since captions are marked as relevant only to their original video, many alternate videos also match the caption, which introduces false-negative caption-video pairs. We show that when these false negatives are corrected, a recent state-of-the-art model gains 25% recall points, a difference that threatens the validity of the benchmark itself. To diagnose and mitigate this issue, we annotate and release 683K additional caption-video pairs. Using these, we recompute effectiveness scores for three models on two standard benchmarks (MSR-VTT and MSVD). We find that (1) the recomputed metrics are up to 25% recall points higher for the best models, (2) these benchmarks are nearing saturation for Recall@10, (3) caption length (generality) is related to the number of positives, and (4) annotation costs can be mitigated through sampling. We recommend retiring these benchmarks in their current form, and we make recommendations for future text-to-video retrieval benchmarks.
Pages: 47-68
Citations: 0
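To illustrate the false-negative effect described in the abstract above, here is a small sketch that computes Recall@k twice: once with only the original caption-video pairs marked relevant, and once with additional annotated positives. The function name `recall_at_k`, the data layout, and the toy identifiers are assumptions made for illustration; they are not taken from the authors' released code or data.

```python
def recall_at_k(ranked, relevant, k):
    """Fraction of captions with at least one relevant video in the top k.

    ranked:   dict mapping caption id -> list of video ids, best first.
    relevant: dict mapping caption id -> set of video ids judged relevant.
    """
    hits = 0
    for caption, videos in ranked.items():
        if any(video in relevant[caption] for video in videos[:k]):
            hits += 1
    return hits / len(ranked)


# Toy data (hypothetical): a model ranks two videos for each caption.
ranked = {"c1": ["v2", "v1"], "c2": ["v3", "v2"]}

# Original labels: each caption is relevant only to its source video.
original = {"c1": {"v1"}, "c2": {"v2"}}

# Corrected labels: extra annotation finds that the top-ranked videos
# also match the captions, so they were false negatives.
corrected = {"c1": {"v1", "v2"}, "c2": {"v2", "v3"}}

recall_original = recall_at_k(ranked, original, k=1)
recall_corrected = recall_at_k(ranked, corrected, k=1)
```

In this toy case the same ranking scores lower under the original labels than under the corrected ones, which is the direction of the shift the abstract reports.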
Evaluating Rules for Aggregating Satisfaction with Activity-travel Episodes to a Day-level Satisfaction Measure
Findings (Sydney (N.S.W.)) Pub Date: 2022-10-03 DOI: 10.32866/001c.38543
Wenbo Guo, T. Schwanen, C. Brand, Y. Chai
Abstract: The recent interest in developing subjective wellbeing aggregation rules in transport research has triggered dialogue across disciplines. Here we analyze how 10 different aggregation rules result in different day-level indicators of satisfaction based on separate measures for each activity and trip on the day and compare the resulting distribution of day-level scores with those for life satisfaction. We find that the normative rules outperform the heuristic rules and are best used to create day-level indicators of satisfaction with activities and trips if the aim is to mimic the statistical distribution for life satisfaction scores.
Citations: 0
Examining Pre- and Post-Pandemic Cross-Border Trips Using Crowdsourced Data at the Second-Busiest US-Mexico Border Community
Findings (Sydney (N.S.W.)) Pub Date: 2022-09-27 DOI: 10.32866/001c.38429
Erik Vargas, Okan Gurbuz, I. Sener, R. Aldrete
Abstract: The US-Mexico border witnesses frequent cross-border travel for educational, recreational, healthcare, and work purposes, with millions of passenger and commercial vehicles crossing the international border each year. In 2020, pandemic-related travel restrictions were applied to non-US citizens at the US-Mexico border and reshaped cross-border trips. Using crowdsourced data, we explored the mobility changes that the COVID-19 pandemic brought to the second-busiest border region between the United States and Mexico. Results showed that although some patterns remained similar, overall mobility decreased significantly.
Citations: 0
An Interrupted Time Series Analysis of the Sociodemographics of Crash Victims during the Illinois Stay at Home Order
Findings (Sydney (N.S.W.)) Pub Date: 2022-09-23 DOI: 10.32866/001c.38490
Mickey Edwards
Abstract: The race/ethnicity and gender of motor vehicle crash victims during the 2020 Illinois stay at home order are compared to previous years. The median poverty rate of crash victims is compared across the five years of 2016-20, finding that poverty is strongly associated with Black male and female crash victims. Several contributing crash factors like speed, distracted driving, seat belt use, and intoxication are also compared. Within race/ethnicity, females significantly decreased their proportion of crash involvement while males significantly increased theirs. An interrupted time series analysis and a segmented binary logistic regression are used in conjunction with a presentation of summary statistics.
Citations: 0