Yinhong Li, Hanwen Qu, Chen Chen, Xiaoyi Lv, Enguang Zuo, Kui Wang, Xulun Cai
Information Processing & Management, Volume 62, Issue 6, Article 104291. Published 2025-07-14.
DOI: 10.1016/j.ipm.2025.104291
URL: https://www.sciencedirect.com/science/article/pii/S0306457325002328
Impact factor: 6.9. JCR: Q1 (Computer Science, Information Systems).
TreeXformer: Extracting tabular feature-context information using tree-structured semantics
Tabular classification learning aims to support decision-making in fields such as finance and recommendation systems by processing various types of structured features in tabular data. Most existing models rely on the multi-layer non-linear structures of deep neural networks to automatically extract feature interactions. However, the heterogeneity of tabular features often leads to the neglect of feature-context information, resulting in redundant or insufficient interactions that degrade model performance. Enhancing the modeling of contextual relationships between features can improve the model’s ability to interpret heterogeneous features effectively. To address this, we propose the TreeXformer model, a customized Transformer network that introduces, for the first time, an abstract tree-structured semantic representation to capture feature-context information. We develop a Tree Graph Estimator (TGE) to construct the tree-structured semantics of features and employ the Guided Interaction Attention (GIA) to facilitate feature interactions. A mean operation is applied across feature dimensions to aggregate global semantic information, improving the model’s interpretability and enhancing the transparency of its decision-making process. Extensive experiments on five public datasets and one private dataset demonstrate that TreeXformer significantly improves model performance, proving its effectiveness and superiority in capturing complex feature relationships. Ultimately, TreeXformer not only enhances classification outcomes but also strengthens model interpretability.
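The abstract does not say how the Tree Graph Estimator or the Guided Interaction Attention are computed, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes the tree is already given as a parent list (a hypothetical stand-in for what TGE would estimate), restricts dot-product attention to each feature's parent, children, and itself (a hypothetical stand-in for GIA), and then averages over the feature tokens, which is the mean aggregation the abstract mentions. All function names and the toy values are illustrative assumptions.

```python
import math

def tree_mask(parent):
    """Return a symmetric 0/1 mask linking each node to itself, its parent, and its children.

    `parent[i]` is the index of node i's parent, or None for the root (assumed input).
    """
    n = len(parent)
    mask = [[1 if i == j else 0 for j in range(n)] for i in range(n)]
    for child, par in enumerate(parent):
        if par is not None:
            mask[child][par] = 1
            mask[par][child] = 1
    return mask

def softmax_masked(scores, allowed):
    """Softmax over the positions where `allowed` is 1; other positions get weight 0."""
    peak = max(s for s, a in zip(scores, allowed) if a)
    exps = [math.exp(s - peak) if a else 0.0 for s, a in zip(scores, allowed)]
    total = sum(exps)
    return [e / total for e in exps]

def guided_attention(tokens, mask):
    """Scaled dot-product self-attention in which each token only attends where mask is 1."""
    dim = len(tokens[0])
    attended = []
    for i, query in enumerate(tokens):
        scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(dim) for key in tokens]
        weights = softmax_masked(scores, mask[i])
        attended.append(
            [sum(weights[j] * tokens[j][c] for j in range(len(tokens))) for c in range(dim)]
        )
    return attended

def mean_pool(tokens):
    """Average the feature tokens into one vector (mean across the feature axis)."""
    n = len(tokens)
    return [sum(t[c] for t in tokens) / n for c in range(len(tokens[0]))]

# Toy example: four feature embeddings of width 2 and an assumed tree 0 -> {1, 2}, 1 -> {3}.
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, -0.5]]
parent = [None, 0, 0, 1]

mask = tree_mask(parent)
attended = guided_attention(tokens, mask)
summary = mean_pool(attended)  # one vector summarizing all features
```

In a real model the token embeddings and the tree would be learned, and the attention would use separate query, key, and value projections; the sketch omits these to keep the masking and pooling steps visible.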
About the journal:
Information Processing and Management is dedicated to publishing cutting-edge original research at the convergence of computing and information science. Our scope encompasses theory, methods, and applications across various domains, including advertising, business, health, information science, information technology marketing, and social computing.
We aim to cater to the interests of both primary researchers and practitioners by offering an effective platform for the timely dissemination of advanced and topical issues in this interdisciplinary field. The journal places particular emphasis on original research articles, research survey articles, research method articles, and articles addressing critical applications of research. Join us in advancing knowledge and innovation at the intersection of computing and information science.