A Hierarchical Spatial Transformer for Massive Point Samples in Continuous Space

Wenchong He, Zhe Jiang, Tingsong Xiao, Zelin Xu, Shigang Chen, Ronald Fick, Miles Medina, Christine Angelini
{"title":"A Hierarchical Spatial Transformer for Massive Point Samples in Continuous Space.","authors":"Wenchong He, Zhe Jiang, Tingsong Xiao, Zelin Xu, Shigang Chen, Ronald Fick, Miles Medina, Christine Angelini","doi":"","DOIUrl":null,"url":null,"abstract":"<p><p>Transformers are widely used deep learning architectures. Existing transformers are mostly designed for sequences (texts or time series), images or videos, and graphs. This paper proposes a novel transformer model for massive (up to a million) point samples in continuous space. Such data are ubiquitous in environment sciences (e.g., sensor observations), numerical simulations (e.g., particle-laden flow, astrophysics), and location-based services (e.g., POIs and trajectories). However, designing a transformer for massive spatial points is non-trivial due to several challenges, including implicit long-range and multi-scale dependency on irregular points in continuous space, a non-uniform point distribution, the potential high computational costs of calculating all-pair attention across massive points, and the risks of over-confident predictions due to varying point density. To address these challenges, we propose a new hierarchical spatial transformer model, which includes multi-resolution representation learning within a quad-tree hierarchy and efficient spatial attention via coarse approximation. We also design an uncertainty quantification branch to estimate prediction confidence related to input feature noise and point sparsity. We provide a theoretical analysis of computational time complexity and memory costs. Extensive experiments on both real-world and synthetic datasets show that our method outperforms multiple baselines in prediction accuracy and our model can scale up to one million points on one NVIDIA A100 GPU. The code is available at https://github.com/spatialdatasciencegroup/HST.</p>","PeriodicalId":72099,"journal":{"name":"Advances in neural information processing systems","volume":"36 ","pages":"33365-33378"},"PeriodicalIF":0.0000,"publicationDate":"2023-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11094554/pdf/","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Advances in neural information processing systems","FirstCategoryId":"1085","ListUrlMain":"","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0

Abstract

Transformers are widely used deep learning architectures. Existing transformers are mostly designed for sequences (text or time series), images or videos, and graphs. This paper proposes a novel transformer model for massive (up to a million) point samples in continuous space. Such data are ubiquitous in environmental sciences (e.g., sensor observations), numerical simulations (e.g., particle-laden flow, astrophysics), and location-based services (e.g., POIs and trajectories). However, designing a transformer for massive spatial points is non-trivial due to several challenges: implicit long-range and multi-scale dependencies among irregular points in continuous space, a non-uniform point distribution, the potentially high computational cost of all-pair attention across massive points, and the risk of over-confident predictions due to varying point density. To address these challenges, we propose a new hierarchical spatial transformer model, which includes multi-resolution representation learning within a quad-tree hierarchy and efficient spatial attention via coarse approximation. We also design an uncertainty quantification branch to estimate prediction confidence related to input feature noise and point sparsity. We provide a theoretical analysis of computational time complexity and memory costs. Extensive experiments on both real-world and synthetic datasets show that our method outperforms multiple baselines in prediction accuracy, and our model can scale up to one million points on a single NVIDIA A100 GPU. The code is available at https://github.com/spatialdatasciencegroup/HST.
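To make the quad-tree hierarchy concrete: the abstract does not spell out the tree construction, but a standard capacity-based quad-tree gives the multi-resolution structure it describes, with dense regions subdividing into deeper (finer) cells and sparse regions staying coarse. The sketch below is a minimal illustration under that assumption; the class and parameter names (`QuadTreeNode`, `capacity`, `max_depth`) are hypothetical and not taken from the HST codebase.

```python
import numpy as np

class QuadTreeNode:
    """One cell of a quad-tree over 2D points in a continuous domain.

    A node subdivides into four children when it holds more than
    `capacity` points, so dense regions get deeper (finer) cells while
    sparse regions stay coarse -- one way to handle the non-uniform
    point distributions the paper targets.
    """

    def __init__(self, xmin, ymin, xmax, ymax, depth=0):
        self.bounds = (xmin, ymin, xmax, ymax)
        self.depth = depth
        self.children = []      # empty for leaf nodes
        self.point_idx = None   # indices of points in this cell (leaves only)

    def build(self, points, idx, capacity=32, max_depth=12):
        if len(idx) <= capacity or self.depth >= max_depth:
            self.point_idx = idx          # leaf: store its point indices
            return
        xmin, ymin, xmax, ymax = self.bounds
        xm, ym = (xmin + xmax) / 2, (ymin + ymax) / 2
        quads = [(xmin, ymin, xm, ym), (xm, ymin, xmax, ym),
                 (xmin, ym, xm, ymax), (xm, ym, xmax, ymax)]
        for (x0, y0, x1, y1) in quads:
            mask = ((points[idx, 0] >= x0) & (points[idx, 0] < x1) &
                    (points[idx, 1] >= y0) & (points[idx, 1] < y1))
            child = QuadTreeNode(x0, y0, x1, y1, self.depth + 1)
            child.build(points, idx[mask], capacity, max_depth)
            self.children.append(child)

# Usage: build a tree over 100k random points in the unit square.
pts = np.random.rand(100_000, 2)
root = QuadTreeNode(0.0, 0.0, 1.0, 1.0)
root.build(pts, np.arange(len(pts)))
```

In a model like the one described, each tree node would additionally carry a learned embedding aggregated from its children, giving the multi-resolution representations the abstract mentions.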
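The "efficient spatial attention via coarse approximation" can likewise be illustrated with a Barnes-Hut-style rule, continuing the quad-tree sketch above: a query point attends to nearby points individually but sees distant regions only through one aggregated key per coarse cell, shrinking the key set from O(n) toward O(log n). This is an assumed stand-in for the paper's actual mechanism; the opening criterion, `theta`, and `collect_keys` are illustrative choices, and the cell centroid stands in for what would be a learned node embedding.

```python
def collect_keys(node, query_xy, theta=0.7, keys=None):
    """Select a reduced key set for one query point.

    Opening rule (an assumption, not the paper's exact criterion):
    a cell whose size/distance ratio is below `theta` is "far" and
    contributes a single coarse key; otherwise it is opened recursively
    down to the raw points near the query.
    """
    if keys is None:
        keys = []
    xmin, ymin, xmax, ymax = node.bounds
    size = max(xmax - xmin, ymax - ymin)
    center = np.array([(xmin + xmax) / 2, (ymin + ymax) / 2])
    dist = np.linalg.norm(query_xy - center)
    if node.children and size / max(dist, 1e-9) >= theta:
        for child in node.children:                      # near: open the cell
            collect_keys(child, query_xy, theta, keys)
    elif node.children:
        keys.append(center)                              # far: one coarse key
    elif node.point_idx is not None and len(node.point_idx):
        keys.extend(pts[node.point_idx])                 # leaf: fine keys
    return keys

# Attention would then run over this reduced key set instead of all points.
q = np.array([0.5, 0.5])
ks = np.stack(collect_keys(root, q))
print(f"{len(ks)} keys instead of {len(pts)}")
```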
