DataVisT5: A Pre-trained Language Model for Jointly Understanding Text and Data Visualization
Zhuoyue Wan, Yuanfeng Song, Shuaimin Li, Chen Jason Zhang, Raymond Chi-Wing Wong
arXiv - CS - Databases, 2024-08-14 (arXiv:2408.07401)
Abstract
Data visualization (DV) is a fundamental tool for efficiently conveying the insights behind big data and is widely adopted in today's data-driven world. Task automation in DV, such as converting natural language queries to visualizations (i.e., text-to-vis), generating explanations from visualizations (i.e., vis-to-text), answering DV-related questions in free form (i.e., FeVisQA), and explicating tabular data (i.e., table-to-text), is vital for advancing the field. Despite their potential, the application of pre-trained language models (PLMs) such as T5 and BERT to DV has been limited by high costs and the challenges of handling cross-modal information, so studies of PLMs for DV remain scarce. We introduce DataVisT5, a novel PLM tailored for DV that enhances the T5 architecture through a hybrid-objective pre-training and multi-task fine-tuning strategy, integrating text and DV datasets to effectively interpret cross-modal semantics. Extensive evaluations on public datasets show that DataVisT5 consistently outperforms current state-of-the-art models on a variety of DV-related tasks. We anticipate that DataVisT5 will not only inspire further research on vertical PLMs but also expand the range of applications for PLMs.
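The abstract describes a single encoder-decoder model serving four DV tasks through multi-task fine-tuning. As a minimal sketch of how such task routing is commonly done with T5-style models, the example below prefixes each input with a task tag and decodes with one shared model. The `t5-base` checkpoint and the prefix strings are illustrative assumptions, not the paper's released DataVisT5 weights or prompt format.

```python
# Sketch: routing four DV tasks through one T5-style encoder-decoder
# via task prefixes. Checkpoint and prefixes are illustrative stand-ins,
# not the released DataVisT5 artifacts.
from transformers import T5ForConditionalGeneration, T5Tokenizer

tokenizer = T5Tokenizer.from_pretrained("t5-base")  # stand-in checkpoint
model = T5ForConditionalGeneration.from_pretrained("t5-base")

# Hypothetical task-prefixed inputs covering the four tasks in the abstract.
examples = {
    "text-to-vis":   "translate NL to vis: show average sales per region",
    "vis-to-text":   "explain vis: mark bar encoding x=region y=avg(sales)",
    "FeVisQA":       "answer DV question: what does a heatmap encode?",
    "table-to-text": "describe table: region | sales | 2024-Q1 ...",
}

for task, prompt in examples.items():
    inputs = tokenizer(prompt, return_tensors="pt")
    outputs = model.generate(**inputs, max_new_tokens=64)
    print(task, "->", tokenizer.decode(outputs[0], skip_special_tokens=True))
```

In this setup, multi-task fine-tuning amounts to mixing prefixed examples from all four datasets into one training stream, so the shared parameters learn the cross-modal mappings jointly rather than per task.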