TreeXformer: Extracting tabular feature-context information using tree-structured semantics

IF 6.9 · Zone 1 (Management) · Q1 (Computer Science, Information Systems)

Information Processing & Management Pub Date : 2025-07-14 DOI:10.1016/j.ipm.2025.104291

Yinhong Li, Hanwen Qu, Chen Chen, Xiaoyi Lv, Enguang Zuo, Kui Wang, Xulun Cai

Volume 62, Issue 6, Article 104291 · Journal article · Not open access · Full text: https://www.sciencedirect.com/science/article/pii/S0306457325002328

Citations: 0

Abstract

TreeXformer: Extracting tabular feature-context information using tree-structured semantics

Tabular classification learning aims to support decision-making in fields such as finance and recommendation systems by processing various types of structured features in tabular data. Most existing models rely on the multi-layer non-linear structures of deep neural networks to automatically extract feature interactions. However, the heterogeneity of tabular features often leads to the neglect of feature-context information, resulting in redundant or insufficient interactions that degrade model performance. Enhancing the modeling of contextual relationships between features can improve the model’s ability to interpret heterogeneous features effectively. To address this, we propose the TreeXformer model, a customized Transformer network that introduces, for the first time, an abstract tree-structured semantic representation to capture feature-context information. We develop a Tree Graph Estimator (TGE) to construct the tree-structured semantics of features and employ the Guided Interaction Attention (GIA) to facilitate feature interactions. A mean operation is applied across feature dimensions to aggregate global semantic information, improving the model’s interpretability and enhancing the transparency of its decision-making process. Extensive experiments on five public datasets and one private dataset demonstrate that TreeXformer significantly improves model performance, proving its effectiveness and superiority in capturing complex feature relationships. Ultimately, TreeXformer not only enhances classification outcomes but also strengthens model interpretability.
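The abstract names three ingredients: a tree structure over features, attention guided by that structure, and a mean over features to aggregate global information. The sketch below illustrates that general pattern only. It is not the paper's Tree Graph Estimator or Guided Interaction Attention, whose details the abstract does not give. The hand-written `parent` array, the rule that a feature attends to itself, its parent, and its children, the identity query/key/value projections, and all function names are assumptions made for illustration.

```python
import math


def tree_mask(parent):
    """Boolean mask: feature i may attend to itself, its parent, and its children.

    parent[i] is the index of feature i's parent in a hypothetical feature
    tree, or -1 for the root. The tree is supplied by hand here (assumption);
    the paper learns its structure with a Tree Graph Estimator.
    """
    n = len(parent)
    mask = [[i == j for j in range(n)] for i in range(n)]
    for child, par in enumerate(parent):
        if par >= 0:
            mask[child][par] = True
            mask[par][child] = True
    return mask


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def masked_attention(tokens, mask):
    """Scaled dot-product self-attention over feature tokens, limited by mask.

    Queries, keys, and values are the tokens themselves (identity projections,
    an assumption that keeps the sketch short).
    """
    n, d = len(tokens), len(tokens[0])
    out = []
    for i in range(n):
        allowed = [j for j in range(n) if mask[i][j]]
        scores = [
            sum(a * b for a, b in zip(tokens[i], tokens[j])) / math.sqrt(d)
            for j in allowed
        ]
        weights = softmax(scores)
        out.append(
            [sum(w * tokens[j][k] for w, j in zip(weights, allowed)) for k in range(d)]
        )
    return out


def mean_pool(tokens):
    """Average the feature tokens into one global vector."""
    n = len(tokens)
    return [sum(t[k] for t in tokens) / n for k in range(len(tokens[0]))]


# Four features embedded in 2 dimensions; feature 0 is the root,
# features 1 and 2 are its children, feature 3 is a child of feature 1.
parent = [-1, 0, 0, 1]
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, -0.5]]
mask = tree_mask(parent)
pooled = mean_pool(masked_attention(tokens, mask))
```

In this toy setup the mask stops feature 3 from attending directly to the root, so its updated representation depends only on itself and its parent. The pooled vector is what a classification head would consume in this simplified picture.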


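Source journal

Information Processing & Management (Engineering & Technology: Computer Science, Information Systems)

CiteScore: 17.00

Self-citation rate: 11.60%

Articles published: 276

Review time: 39 days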

About the journal: Information Processing and Management is dedicated to publishing cutting-edge original research at the convergence of computing and information science. Our scope encompasses theory, methods, and applications across various domains, including advertising, business, health, information science, information technology marketing, and social computing. We aim to cater to the interests of both primary researchers and practitioners by offering an effective platform for the timely dissemination of advanced and topical issues in this interdisciplinary field. The journal places particular emphasis on original research articles, research survey articles, research method articles, and articles addressing critical applications of research. Join us in advancing knowledge and innovation at the intersection of computing and information science.