Data Mining and Knowledge Discovery: latest articles, page 10

Traffic forecasting on new roads using spatial contrastive pre-training (SCPT)

Zone 3, Computer Science

Data Mining and Knowledge Discovery Pub Date : 2023-09-27 DOI: 10.1007/s10618-023-00982-0

Arian Prabowo, Hao Xue, Wei Shao, Piotr Koniusz, Flora D. Salim

Abstract: New roads are being constructed all the time. However, the capabilities of previous deep forecasting models to generalize to new roads not seen in the training data (unseen roads) are rarely explored. In this paper, we introduce a novel setup called a spatio-temporal split to evaluate the models’ capabilities to generalize to unseen roads. In this setup, the models are trained on data from a sample of roads, but tested on roads not seen in the training data. Moreover, we also present a novel framework called Spatial Contrastive Pre-Training (SCPT) where we introduce a spatial encoder module to extract latent features from unseen roads during inference time. This spatial encoder is pre-trained using contrastive learning. During inference, the spatial encoder only requires two days of traffic data on the new roads and does not require any re-training. We also show that the output from the spatial encoder can be used effectively to infer latent node embeddings on unseen roads during inference time. The SCPT framework also incorporates a new layer, named the spatially gated addition layer, to effectively combine the latent features from the output of the spatial encoder to existing backbones. Additionally, since there is limited data on the unseen roads, we argue that it is better to decouple traffic signals into trivial-to-capture periodic signals and difficult-to-capture Markovian signals, and for the spatial encoder to only learn the Markovian signals. Finally, we empirically evaluated SCPT using the ST split setup on four real-world datasets. The results showed that adding SCPT to a backbone consistently improves forecasting performance on unseen roads. More importantly, the improvements are greater when forecasting further into the future. The code is available on GitHub: https://github.com/cruiseresearchgroup/forecasting-on-new-roads

Citations: 1
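The spatio-temporal split described in the SCPT abstract above can be sketched roughly as follows. This is a minimal illustration and not the authors' code: the helper name `spatio_temporal_split`, the parameters `n_seen_roads` and `n_train_steps`, and the choice to also cut the time axis chronologically are all assumptions made for the example.

```python
import random


def spatio_temporal_split(road_ids, n_steps, n_seen_roads, n_train_steps, seed=0):
    """Hypothetical helper: hold out both roads and later time steps for testing."""
    rng = random.Random(seed)
    shuffled = list(road_ids)
    rng.shuffle(shuffled)  # sample which roads the model is allowed to see
    seen = sorted(shuffled[:n_seen_roads])
    unseen = sorted(shuffled[n_seen_roads:])
    return {
        # training uses only seen roads and the earlier time steps
        "train": {"roads": seen, "steps": range(0, n_train_steps)},
        # testing uses only unseen roads and the later time steps
        "test": {"roads": unseen, "steps": range(n_train_steps, n_steps)},
    }


split = spatio_temporal_split(range(10), n_steps=100, n_seen_roads=7, n_train_steps=70)
print(split["train"]["roads"], split["test"]["roads"])
```

The point of the sketch is only that no test road appears in the training set; the sampling ratio and the temporal cut used in the paper are not given in the abstract.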

Can local explanation techniques explain linear additive models?

Zone 3, Computer Science

Data Mining and Knowledge Discovery Pub Date : 2023-09-19 DOI: 10.1007/s10618-023-00971-3

Amir Hossein Akhavan Rahnama, Judith Bütepage, Pierre Geurts, Henrik Boström

Abstract: Local model-agnostic additive explanation techniques decompose the predicted output of a black-box model into additive feature importance scores. Questions have been raised about the accuracy of the produced local additive explanations. We investigate this by studying whether some of the most popular explanation techniques can accurately explain the decisions of linear additive models. We show that even though the explanations generated by these techniques are linear additives, they can fail to provide accurate explanations when explaining linear additive models. In the experiments, we measure the accuracy of additive explanations, as produced by, e.g., LIME and SHAP, along with the non-additive explanations of Local Permutation Importance (LPI) when explaining Linear and Logistic Regression and Gaussian naive Bayes models over 40 tabular datasets. We also investigate the degree to which different factors, such as the number of numerical or categorical or correlated features, the predictive performance of the black-box model, explanation sample size, similarity metric, and the pre-processing technique used on the dataset can directly affect the accuracy of local explanations.

Citations: 1
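To make "additive feature importance scores" concrete for the entry above: a linear model admits an exactly additive decomposition of a prediction relative to a baseline input, where feature j contributes w_j * (x_j - baseline_j). The sketch below illustrates that identity only. The function names, the example weights, and the choice of baseline are assumptions for illustration, and the abstract does not say whether the paper uses this quantity as its reference.

```python
def predict(weights, bias, x):
    """Linear additive model: bias plus the weighted sum of the features."""
    return bias + sum(w * xi for w, xi in zip(weights, x))


def linear_contributions(weights, x, baseline):
    """Per-feature contribution w_j * (x_j - baseline_j), relative to a baseline."""
    return [w * (xi - bi) for w, xi, bi in zip(weights, x, baseline)]


weights, bias = [2.0, -1.0, 0.5], 1.0   # hypothetical model parameters
x = [3.0, 2.0, 4.0]                     # instance being explained
baseline = [1.0, 1.0, 2.0]              # e.g. feature means (an assumption)

contribs = linear_contributions(weights, x, baseline)
gap = predict(weights, bias, x) - predict(weights, bias, baseline)
print(contribs, sum(contribs), gap)
```

The contributions sum to the difference between the prediction at `x` and the prediction at the baseline, which is the property an additive explanation is expected to satisfy.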

Improving position encoding of transformers for multivariate time series classification

Zone 3, Computer Science

Data Mining and Knowledge Discovery Pub Date : 2023-09-05 DOI: 10.1007/s10618-023-00948-2

Navid Mohammadi Foumani, Chang Wei Tan, Geoffrey I. Webb, Mahsa Salehi

Abstract: Transformers have demonstrated outstanding performance in many applications of deep learning. When applied to time series data, transformers require effective position encoding to capture the ordering of the time series data. The efficacy of position encoding in time series analysis is not well-studied and remains controversial, e.g., whether it is better to inject absolute position encoding or relative position encoding, or a combination of them. In order to clarify this, we first review existing absolute and relative position encoding methods when applied in time series classification. We then propose a new absolute position encoding method dedicated to time series data called time Absolute Position Encoding (tAPE). Our new method incorporates the series length and input embedding dimension in absolute position encoding. Additionally, we propose computationally Efficient implementation of Relative Position Encoding (eRPE) to improve generalisability for time series. We then propose a novel multivariate time series classification model combining tAPE/eRPE and convolution-based input encoding named ConvTran to improve the position and data embedding of time series data. The proposed absolute and relative position encoding methods are simple and efficient. They can be easily integrated into transformer blocks and used for downstream tasks such as forecasting, extrinsic regression, and anomaly detection. Extensive experiments on 32 multivariate time-series datasets show that our model is significantly more accurate than state-of-the-art convolution and transformer-based models. Code and models are open-sourced at https://github.com/Navidfoumani/ConvTran

Citations: 1
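As background for the entry above, the standard sinusoidal absolute position encoding used in transformers can be sketched in a few lines. The `freq_scale` argument is a hypothetical knob added for this example: setting it to `d_model / length` is one way series length and embedding dimension could enter the encoding, but that choice is an assumption here and is not confirmed by the abstract. See the linked ConvTran repository for the authors' actual tAPE implementation.

```python
import math


def sinusoidal_encoding(length, d_model, freq_scale=1.0):
    """Sinusoidal absolute position encoding; freq_scale is a hypothetical extension."""
    if d_model % 2 != 0:
        raise ValueError("d_model must be even")
    table = []
    for pos in range(length):
        row = []
        for k in range(d_model // 2):
            # base frequency 10000^(-2k / d_model), optionally rescaled
            omega = freq_scale * 10000.0 ** (-2.0 * k / d_model)
            row.append(math.sin(pos * omega))  # even dimension
            row.append(math.cos(pos * omega))  # odd dimension
        table.append(row)
    return table


plain = sinusoidal_encoding(length=8, d_model=4)
# Assumed length-aware variant: rescale every frequency by d_model / length.
scaled = sinusoidal_encoding(length=8, d_model=4, freq_scale=4 / 8)
```

With `freq_scale=1.0` this reduces to the usual fixed encoding, so the two tables can be compared directly to see how the rescaling changes the spacing between positions.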

Z-Time: efficient and effective interpretable multivariate time series classification

IF 4.8, Zone 3, Computer Science

Data Mining and Knowledge Discovery Pub Date : 2023-09-05 DOI: 10.1007/s10618-023-00969-x

Zed Lee, Tony Lindgren, P. Papapetrou

Citations: 1

Studying bias in visual features through the lens of optimal transport

IF 4.8, Zone 3, Computer Science

Data Mining and Knowledge Discovery Pub Date : 2023-09-02 DOI: 10.1007/s10618-023-00972-2

Simone Fabbrizzi, Xuan Zhao, Emmanouil Krasanakis, Symeon Papadopoulos, Eirini Ntoutsi

Citations: 1

Network embedding based on high-degree penalty and adaptive negative sampling

IF 4.8, Zone 3, Computer Science

Data Mining and Knowledge Discovery Pub Date : 2023-09-02 DOI: 10.1007/s10618-023-00973-1

Gang-Feng Ma, Xu-Hua Yang, Wei Ye, Xinli Xu, Lei Ye

Citations: 0

Improving neural network’s robustness on tabular data with D-layers

IF 4.8, Zone 3, Computer Science

Data Mining and Knowledge Discovery Pub Date : 2023-08-31 DOI: 10.1007/s10618-023-00965-1

Haiyang Xia, Nayyar Zaidi, Yishuo Zhang, Gang Li

Citations: 0

Sky-signatures: detecting and characterizing recurrent behavior in sequential data

IF 4.8, Zone 3, Computer Science

Data Mining and Knowledge Discovery Pub Date : 2023-08-29 DOI: 10.1007/s10618-023-00949-1

Clément Gautrais, Peggy Cellier, T. Guyet, R. Quiniou, A. Termier

Citations: 0

SALτ: efficiently stopping TAR by improving priors estimates

IF 4.8, Zone 3, Computer Science

Data Mining and Knowledge Discovery Pub Date : 2023-08-28 DOI: 10.1007/s10618-023-00961-5

Alessio Molinari, Andrea Esuli

Citations: 0

A semi-supervised interactive algorithm for change point detection

IF 4.8, Zone 3, Computer Science

Data Mining and Knowledge Discovery Pub Date : 2023-08-28 DOI: 10.1007/s10618-023-00974-0

Zhenxiang Cao, N. Seeuws, Marina De Vos, Alexander Bertrand

Citations: 0