面向表格数据生成的基于转换器的生成建模

IF 4.6 2区计算机科学 Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

Artificial Intelligence Pub Date : 2025-01-20 DOI:10.1016/j.artint.2025.104292

Alex X. Wang , Binh P. Nguyen

{"title":"面向表格数据生成的基于转换器的生成建模","authors":"Alex X. Wang , Binh P. Nguyen","doi":"10.1016/j.artint.2025.104292","DOIUrl":null,"url":null,"abstract":"<div><div>Tabular data synthesis presents unique challenges, with Transformer models remaining underexplored despite the applications of Variational Autoencoders and Generative Adversarial Networks. To address this gap, we propose the Transformer-based Tabular Variational AutoEncoder (TTVAE), leveraging the attention mechanism for capturing complex data distributions. The inclusion of the attention mechanism enables our model to understand complex relationships among heterogeneous features, a task often difficult for traditional methods. TTVAE facilitates the integration of interpolation within the latent space during the data generation process. Specifically, TTVAE is trained once, establishing a low-dimensional representation of real data, and then various latent interpolation methods can efficiently generate synthetic latent points. Through extensive experiments on diverse datasets, TTVAE consistently achieves state-of-the-art performance, highlighting its adaptability across different feature types and data sizes. This innovative approach, empowered by the attention mechanism and the integration of interpolation, addresses the complex challenges of tabular data synthesis, establishing TTVAE as a powerful solution.</div></div>","PeriodicalId":8434,"journal":{"name":"Artificial Intelligence","volume":"340 ","pages":"Article 104292"},"PeriodicalIF":4.6000,"publicationDate":"2025-01-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"TTVAE: Transformer-based generative modeling for tabular data generation\",\"authors\":\"Alex X. Wang , Binh P. Nguyen\",\"doi\":\"10.1016/j.artint.2025.104292\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<div><div>Tabular data synthesis presents unique challenges, with Transformer models remaining underexplored despite the applications of Variational Autoencoders and Generative Adversarial Networks. To address this gap, we propose the Transformer-based Tabular Variational AutoEncoder (TTVAE), leveraging the attention mechanism for capturing complex data distributions. The inclusion of the attention mechanism enables our model to understand complex relationships among heterogeneous features, a task often difficult for traditional methods. TTVAE facilitates the integration of interpolation within the latent space during the data generation process. Specifically, TTVAE is trained once, establishing a low-dimensional representation of real data, and then various latent interpolation methods can efficiently generate synthetic latent points. Through extensive experiments on diverse datasets, TTVAE consistently achieves state-of-the-art performance, highlighting its adaptability across different feature types and data sizes. This innovative approach, empowered by the attention mechanism and the integration of interpolation, addresses the complex challenges of tabular data synthesis, establishing TTVAE as a powerful solution.</div></div>\",\"PeriodicalId\":8434,\"journal\":{\"name\":\"Artificial Intelligence\",\"volume\":\"340 \",\"pages\":\"Article 104292\"},\"PeriodicalIF\":4.6000,\"publicationDate\":\"2025-01-20\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Artificial Intelligence\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://www.sciencedirect.com/science/article/pii/S0004370225000116\",\"RegionNum\":2,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Artificial Intelligence","FirstCategoryId":"94","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S0004370225000116","RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}

引用次数: 0

摘要

表格数据合成提出了独特的挑战，尽管有变分自编码器和生成对抗网络的应用，变压器模型仍然没有得到充分的探索。为了解决这一差距，我们提出了基于转换器的表变分自动编码器（TTVAE），利用注意力机制来捕获复杂的数据分布。注意机制的包含使我们的模型能够理解异构特征之间的复杂关系，这是传统方法通常难以完成的任务。TTVAE便于在数据生成过程中对潜在空间内的插值进行整合。具体来说，TTVAE训练一次，建立真实数据的低维表示，然后各种潜在插值方法可以有效地生成合成潜在点。通过在不同数据集上的大量实验，TTVAE始终达到最先进的性能，突出了其在不同特征类型和数据大小上的适应性。这种创新的方法，由注意力机制和内插的集成授权，解决了表格数据合成的复杂挑战，使TTVAE成为一个强大的解决方案。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

查看原文本刊更多论文

TTVAE: Transformer-based generative modeling for tabular data generation

Tabular data synthesis presents unique challenges, with Transformer models remaining underexplored despite the applications of Variational Autoencoders and Generative Adversarial Networks. To address this gap, we propose the Transformer-based Tabular Variational AutoEncoder (TTVAE), leveraging the attention mechanism for capturing complex data distributions. The inclusion of the attention mechanism enables our model to understand complex relationships among heterogeneous features, a task often difficult for traditional methods. TTVAE facilitates the integration of interpolation within the latent space during the data generation process. Specifically, TTVAE is trained once, establishing a low-dimensional representation of real data, and then various latent interpolation methods can efficiently generate synthetic latent points. Through extensive experiments on diverse datasets, TTVAE consistently achieves state-of-the-art performance, highlighting its adaptability across different feature types and data sizes. This innovative approach, empowered by the attention mechanism and the integration of interpolation, addresses the complex challenges of tabular data synthesis, establishing TTVAE as a powerful solution.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

Artificial Intelligence 工程技术-计算机：人工智能

CiteScore

11.20

自引率

1.40%

发文量

118

审稿时长

8 months

期刊介绍： The Journal of Artificial Intelligence (AIJ) welcomes papers covering a broad spectrum of AI topics, including cognition, automated reasoning, computer vision, machine learning, and more. Papers should demonstrate advancements in AI and propose innovative approaches to AI problems. Additionally, the journal accepts papers describing AI applications, focusing on how new methods enhance performance rather than reiterating conventional approaches. In addition to regular papers, AIJ also accepts Research Notes, Research Field Reviews, Position Papers, Book Reviews, and summary papers on AI challenges and competitions.