DataVisT5: A Pre-trained Language Model for Jointly Understanding Text and Data Visualization

arXiv - CS - Databases | Pub Date: 2024-08-14 | DOI: arxiv-2408.07401

Zhuoyue Wan, Yuanfeng Song, Shuaimin Li, Chen Jason Zhang, Raymond Chi-Wing Wong


Citations: 0

Abstract




Data visualization (DV) is a fundamental tool for efficiently conveying the insights behind big data, and it is widely accepted in today's data-driven world. Automating DV tasks, such as converting natural language queries to visualizations (text-to-vis), generating explanations from visualizations (vis-to-text), answering DV-related questions in free form (FeVisQA), and explicating tabular data (table-to-text), is vital for advancing the field. Despite their potential, pre-trained language models (PLMs) such as T5 and BERT have seen limited application in DV because of high costs and the difficulty of handling cross-modal information, so few studies address PLMs for DV. We introduce DataVisT5, a novel PLM tailored for DV that enhances the T5 architecture through hybrid-objective pre-training and a multi-task fine-tuning strategy, integrating text and DV datasets to interpret cross-modal semantics effectively. Extensive evaluations on public datasets show that DataVisT5 consistently outperforms current state-of-the-art models on various DV-related tasks. We anticipate that DataVisT5 will not only inspire further research on vertical PLMs but also expand the range of applications for PLMs.
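The abstract names four DV tasks handled by a single T5-style model. A minimal sketch of how such tasks could be cast into one text-to-text format is shown below. The task prefixes, the `Example` type, the helper names, and the toy data are all hypothetical illustrations; the abstract does not specify the prompt format or the query syntax that DataVisT5 actually uses.

```python
from dataclasses import dataclass

# Hypothetical task prefixes (assumption): the abstract lists the tasks
# but does not say how inputs are marked for each one.
TASK_PREFIXES = {
    "text-to-vis": "translate question to visualization query: ",
    "vis-to-text": "describe visualization query: ",
    "fevisqa": "answer question about visualization: ",
    "table-to-text": "describe table: ",
}


@dataclass
class Example:
    task: str    # one of the keys in TASK_PREFIXES
    source: str  # model input, already serialized as text
    target: str  # expected model output


def to_text_pair(example: Example) -> tuple[str, str]:
    """Turn one task example into an (input, output) string pair."""
    if example.task not in TASK_PREFIXES:
        raise ValueError(f"unknown task: {example.task}")
    return TASK_PREFIXES[example.task] + example.source, example.target


def build_multitask_batch(examples: list[Example]) -> list[tuple[str, str]]:
    """Mix examples from different tasks into one list of text pairs."""
    return [to_text_pair(e) for e in examples]


# Toy data, made up for illustration only.
batch = build_multitask_batch([
    Example("text-to-vis", "show total sales per region",
            "bar chart of sum(sales) grouped by region"),
    Example("table-to-text", "region | sales\nEast | 10\nWest | 7",
            "East has higher sales than West."),
])
```

Because every task reduces to a pair of strings, a single sequence-to-sequence model can in principle be fine-tuned on the mixed batch, which is the general idea behind multi-task fine-tuning of T5-style models.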
