2023 IEEE 39th International Conference on Data Engineering (ICDE): latest papers, page 5

Distribution-Regularized Federated Learning on Non-IID Data

2023 IEEE 39th International Conference on Data Engineering (ICDE) Pub Date : 2023-04-01 DOI: 10.1109/ICDE55515.2023.00164

Yansheng Wang, Yongxin Tong, Zimu Zhou, Ruisheng Zhang, Sinno Jialin Pan, Lixin Fan, Qiang Yang

Citations: 0

User-Defined Functions in Modern Data Engines

2023 IEEE 39th International Conference on Data Engineering (ICDE) Pub Date : 2023-04-01 DOI: 10.1109/ICDE55515.2023.00276

Ioannis Foufoulas, A. Simitsis

Citations: 3

Discovering Editing Rules by Deep Reinforcement Learning

2023 IEEE 39th International Conference on Data Engineering (ICDE) Pub Date : 2023-04-01 DOI: 10.1109/ICDE55515.2023.00034

Yinan Mei, Shaoxu Song, Chenguang Fang, Ziheng Wei, Jingyun Fang, Jiang Long

Abstract: Editing rules specify the conditions for applying high-quality master data to repair low-quality input data. Discovering editing rules, however, is challenging, since it considers not only the well-curated master data but also the large-scale input data, an extremely large search space. A natural baseline, namely EnuMiner, enumerates the rules with possible conditions from both master and input data at high cost. Although several pruning strategies are enabled, the algorithm still takes a long time when the enumeration space is large. To avoid enumerating all candidate rules during mining, we propose modeling the rule discovery process as a Markov Decision Process. Specifically, we discover editing rules by growing a rule tree where each node corresponds to a rule. The algorithm generates a new rule from the current node as a child node. We propose a reinforcement learning-based editing rule discovery algorithm, RLMiner, which trains an agent to wisely make decisions on branches when traversing the tree. Following the idea of evaluating rules, we design a reward function that is more in line with rule discovery scenarios and makes our algorithm perform effectively and efficiently. The experimental results show that our proposed RLMiner can mine high-utility editing rules like EnuMiner and scale well on datasets with many attributes and large domains.
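The rule-tree idea in this abstract can be illustrated with a toy sketch. Everything here is an assumption made for illustration: the condition names, the `utility` score, and above all the greedy walk, which stands in for the trained agent and is not RLMiner's policy or reward function. It only shows how each node holds a rule and how a child rule is generated from the current node.

```python
from dataclasses import dataclass, field

# Hypothetical toy setting: a "rule" is a frozenset of condition names.
CONDITIONS = ["city=zip", "name=name", "phone=phone"]

# Made-up weights; utility() is a stand-in score, not the paper's reward.
WEIGHTS = {"city=zip": 0.6, "name=name": 0.3, "phone=phone": -0.2}


def utility(rule: frozenset) -> float:
    """Reward useful conditions and penalize longer rules."""
    return sum(WEIGHTS[c] for c in rule) - 0.05 * len(rule)


@dataclass
class Node:
    rule: frozenset
    children: list = field(default_factory=list)


def expand(node: Node) -> list:
    """Generate child rules by adding one unused condition to the current rule."""
    node.children = [Node(node.rule | {c}) for c in CONDITIONS if c not in node.rule]
    return node.children


def greedy_walk(root: Node, max_depth: int = 3) -> Node:
    """Follow the best-scoring child at each level; stop when no child improves.

    A greedy choice replaces the learned branch decision described in the abstract.
    """
    current = root
    for _ in range(max_depth):
        children = expand(current)
        if not children:
            break
        best = max(children, key=lambda n: utility(n.rule))
        if utility(best.rule) <= utility(current.rule):
            break
        current = best
    return current


best = greedy_walk(Node(frozenset()))
print(sorted(best.rule))  # ['city=zip', 'name=name']
```

The walk visits only one branch per level instead of enumerating every candidate rule, which is the cost the abstract attributes to EnuMiner.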

Citations: 1

Exploring both Individuality and Cooperation for Air-Ground Spatial Crowdsourcing by Multi-Agent Deep Reinforcement Learning

2023 IEEE 39th International Conference on Data Engineering (ICDE) Pub Date : 2023-04-01 DOI: 10.1109/ICDE55515.2023.00023

Yuxiao Ye, Chi Harold Liu, Zipeng Dai, Jianxin R. Zhao, Ye Yuan, Guoren Wang, Jian Tang

Abstract: Spatial crowdsourcing (SC) has proven to be a promising paradigm for employing human workers to collect data from diverse Points of Interest (PoIs) in a given area. Different from using human participants, we propose a novel air-ground SC scenario to fully take advantage of the benefits brought by unmanned vehicles (UVs), including unmanned aerial vehicles (UAVs) with controllable high mobility and unmanned ground vehicles (UGVs) with abundant sensing resources. The objective is to maximize the amount of collected data and the geographical fairness among all PoIs while minimizing data loss and energy consumption, integrated as a single metric called "efficiency". We explicitly explore both the individuality and the cooperation of UAVs and UGVs by proposing a multi-agent deep reinforcement learning (MADRL) framework called "h/i-MADRL". Compatible with all multi-agent actor-critic methods, h/i-MADRL adds two novel plug-in modules: (a) h-CoPO, which models the cooperation preference among heterogeneous UAVs and UGVs; and (b) i-EOI, which extracts the UV’s individuality and encourages a better spatial division of work by adding intrinsic reward. Extensive experimental results on two real-world datasets on the Purdue and NCSU campuses confirm that h/i-MADRL achieves a better exploration of both individuality and cooperation simultaneously, resulting in better performance in terms of efficiency compared with five baselines.

Citations: 0

Cost-Sensitive Portfolio Selection via Deep Reinforcement Learning (Extended Abstract)

2023 IEEE 39th International Conference on Data Engineering (ICDE) Pub Date : 2023-04-01 DOI: 10.1109/ICDE55515.2023.00312

Yifan Zhang, P. Zhao, Qingyao Wu, Bin Li, Junzhou Huang, Mingkui Tan

Abstract: Portfolio selection is an important real-world financial task and has attracted extensive attention in artificial intelligence communities. This task, however, has two main difficulties: (i) the non-stationary price series and complex asset correlations make the learning of feature representation very hard; (ii) the practicality principle in financial markets requires controlling both transaction and risk costs. Most existing methods adopt handcrafted features and/or consider no constraints for the costs, which may make them perform unsatisfactorily and fail to control both costs in practice. In this paper, we propose a cost-sensitive portfolio selection method with deep reinforcement learning. Specifically, a novel two-stream portfolio policy network is devised to extract both price series patterns and asset correlations, while a new cost-sensitive reward function is developed to maximize the accumulated return and constrain both costs via reinforcement learning. We theoretically analyze the near-optimality of the proposed reward, which shows that the growth rate of the policy regarding this reward function can approach the theoretical optimum. We also empirically evaluate the proposed method on real-world datasets. Promising results demonstrate the effectiveness and superiority of the proposed method in terms of profitability, cost-sensitivity and representation abilities.
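The abstract does not give the form of its cost-sensitive reward, so the following is only a sketch of what a reward that trades return against transaction and risk costs might look like. The formula, the names `cost_rate` and `risk_weight`, and the dispersion-based risk term are all assumptions, not the paper's definition.

```python
import math


def step_reward(prev_w, new_w, price_rel, cost_rate=0.0025, risk_weight=0.1):
    """Hypothetical per-period reward: log net return minus a risk penalty.

    prev_w, new_w: portfolio weights before and after rebalancing (each sums to 1).
    price_rel:     per-asset price relatives (closing price / previous closing price).
    """
    # Gross growth of the rebalanced portfolio over the period.
    gross = sum(w * x for w, x in zip(new_w, price_rel))
    # Transaction cost proportional to turnover (assumed proportional-cost model).
    turnover = sum(abs(a - b) for a, b in zip(new_w, prev_w))
    net = gross * (1.0 - cost_rate * turnover)
    # Assumed risk proxy: weighted dispersion of asset returns around the portfolio return.
    risk = sum(w * (x - gross) ** 2 for w, x in zip(new_w, price_rel))
    return math.log(net) - risk_weight * risk


# Holding an unchanged portfolio in a flat market earns zero reward.
print(step_reward([0.5, 0.5], [0.5, 0.5], [1.0, 1.0]))  # 0.0
# Fully switching assets in the same flat market is penalized by the trading cost.
print(step_reward([1.0, 0.0], [0.0, 1.0], [1.0, 1.0]) < 0)  # True
```

Summing this reward over periods gives a log-growth objective with both costs folded in, which is one plausible reading of "maximize the accumulated return and constrain both costs".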

Citations: 1

ADAMANT: A Query Executor with Plug-In Interfaces for Easy Co-processor Integration

2023 IEEE 39th International Conference on Data Engineering (ICDE) Pub Date : 2023-04-01 DOI: 10.1109/ICDE55515.2023.00093

B. Gurumurthy, David Broneske, Gabriel Campero Durand, Thilo Pionteck, Gunter Saake

Abstract: Today’s processor landscape is increasingly heterogeneous with the availability of co-processors. This landscape impacts query engines, as they need to be reworked to keep competitive performance by leveraging the underlying architectures. Such a rework might be costly if, for each external processor or SDK, peripheral components needed to be developed as well, resulting in redundant effort and adoption difficulties. In this paper, we propose an approach to overcome these shortcomings through ADAMANT, a query executor equipped with interfaces to plug in new co-processors without reworking other components of a query engine. ADAMANT consists of 1) pluggable interfaces that allow interaction with co-processors, encapsulating operator implementations, and 2) a unified runtime that handles the execution on arbitrary co-processors, with a chunked execution model for scalable query processing. To evaluate ADAMANT’s versatility, we plug in different implementations of a CPU/GPU-based system (using OpenCL, OpenMP, & CUDA) and analyze their performance on TPC-H queries. We identify a 4x performance difference between an arbitrary chunked execution and a more architecturally conscious pipelined execution. Furthermore, our comparisons with HeavyDB show complex performance variations, with speed-ups of up to a factor of 2x from our hardware-conscious execution. We envision initiatives like ADAMANT to ease the study of complex optimizations required in co-processor systems, paving the way for efficient and portable data management tools without cutbacks.
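The two components named in this abstract, a pluggable device interface and a runtime with chunked execution, can be sketched in a few lines. The class and function names below (`Device`, `CpuDevice`, `run_filter`) are hypothetical and do not come from ADAMANT; a plain Python list comprehension stands in for an OpenCL, OpenMP, or CUDA operator implementation.

```python
from abc import ABC, abstractmethod
from typing import Iterator, List


class Device(ABC):
    """Hypothetical plug-in interface: a device only has to run an operator on one chunk."""

    @abstractmethod
    def filter_gt(self, chunk: List[int], threshold: int) -> List[int]:
        """Return the values in the chunk that are greater than the threshold."""


class CpuDevice(Device):
    """Stand-in implementation; a GPU plug-in would implement the same method."""

    def filter_gt(self, chunk: List[int], threshold: int) -> List[int]:
        return [v for v in chunk if v > threshold]


def chunks(data: List[int], size: int) -> Iterator[List[int]]:
    """Split the input column into fixed-size chunks."""
    for start in range(0, len(data), size):
        yield data[start:start + size]


def run_filter(device: Device, data: List[int], threshold: int, chunk_size: int = 4) -> List[int]:
    """Device-agnostic runtime loop: feed chunks to whichever device was plugged in."""
    out: List[int] = []
    for chunk in chunks(data, chunk_size):
        out.extend(device.filter_gt(chunk, threshold))
    return out


print(run_filter(CpuDevice(), list(range(10)), 6))  # [7, 8, 9]
```

Because `run_filter` depends only on the `Device` interface, adding a new co-processor in this sketch means writing one new subclass and leaving the runtime untouched.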

Citations: 0

Forecasting COVID-19 Dynamics: Clustering, Generalized Spatiotemporal Attention, and Impacts of Mobility and Geographic Proximity

2023 IEEE 39th International Conference on Data Engineering (ICDE) Pub Date : 2023-04-01 DOI: 10.1109/ICDE55515.2023.00221

Tong Shen, Yang Li, J. Moura

Abstract: Forecasting the dynamics of COVID-19 enables government agencies and public health administrators to take proactive measures to combat the pandemic. This forecasting task faces several key challenges. First, the dynamics of COVID-19 exhibit complex spatial and temporal dependencies. The current growing trend at a location may be similar to that at another location in the past. Second, numerous factors, such as population mobility and geographic proximity between regions, mask usage, vaccine coverage, etc., significantly impact the dynamics. Third, we need to find the appropriate granularity for the forecasting task. The granularity should not be so coarse that we ignore the idiosyncrasies of individual regions. Still, the granularity should not be so fine that the prediction results are seriously vulnerable to noise. This paper addresses these challenges. We propose a simple but effective clustering algorithm that finds the appropriate granularity for the forecasting task. We invent generalized spatiotemporal attention, an attention mechanism that is generalized enough to capture the complex spatial and temporal dependencies and to flexibly account for intra- and inter-region characteristics such as geographic proximity and population mobility. Based on this generalized spatiotemporal attention, we designed COVID-Forecaster, a lightweight deep learning model for forecasting the dynamics of COVID-19. Experimental results demonstrate that COVID-Forecaster significantly outperforms state-of-the-art models. For example, COVID-Forecaster reduces the mean absolute percentage error (MAPE) by 6.8% and the weighted absolute percentage error (WAPE) by 13.5% in forecasting the COVID-19 dynamics for the 3141 counties of the United States.
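For readers unfamiliar with the two error metrics quoted above, here is a small sketch using their usual definitions. Skipping zero-valued actuals in MAPE is an assumption made here to avoid division by zero; the abstract does not say how the paper handles that case, and the sample numbers are made up.

```python
def mape(actual, predicted):
    """Mean absolute percentage error: average of |a - p| / |a| over the series.

    Zero-valued actuals are skipped (an assumption, to avoid dividing by zero).
    """
    pairs = [(a, p) for a, p in zip(actual, predicted) if a != 0]
    return sum(abs(a - p) / abs(a) for a, p in pairs) / len(pairs)


def wape(actual, predicted):
    """Weighted absolute percentage error: total absolute error over total actual volume."""
    total_error = sum(abs(a - p) for a, p in zip(actual, predicted))
    return total_error / sum(abs(a) for a in actual)


# Made-up case counts for a large and a small region.
actual = [100, 10]
predicted = [90, 15]
print(round(mape(actual, predicted), 4))  # 0.3
print(round(wape(actual, predicted), 4))  # 0.1364
```

The example shows why both are reported: the small region's 50% miss dominates MAPE, while WAPE weights each region by its volume and so stays closer to the large region's 10% error.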

Citations: 0

DBCatcher: A Cloud Database Online Anomaly Detection System based on Indicator Correlation

2023 IEEE 39th International Conference on Data Engineering (ICDE) Pub Date : 2023-04-01 DOI: 10.1109/ICDE55515.2023.00091

Guangyu Zhang, Chun-hua Li, Ke Zhou, Li Liu, Ce Zhang, Wancheng Chen, Haotian Fang, Bin Cheng, Jie Yang, Jiashu Xing

Abstract: Anomaly detection systems play an important role in maintaining the stability of cloud databases. Existing studies mainly focus on significant deviations in multivariate time series, such as a combination of CPU utilization, transactions per second, etc., to detect abnormal issues. Due to the complexity of cloud database structure and functions, these approaches struggle to balance detection performance, detection efficiency and workload adaptability. In this paper, we propose DBCatcher, a cloud database online anomaly detection system based on indicator correlation. Through extensive analysis of real-world cloud database time series, we find correlations among trends in the same key performance indicators across databases within the same unit, which inspires us to explore a time series correlation measurement method that can efficiently detect abnormal issues. Meanwhile, we design a flexible time window observation mechanism and an adaptive threshold learning policy to minimize misjudgment caused by key performance indicator fluctuations, greatly enhancing the detection performance and workload adaptability. We conduct extensive experiments under real-world and synthetic workloads. Experimental results show that DBCatcher significantly improves the detection performance and detection efficiency compared to existing methods.
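The core observation in this abstract, that the same indicator tends to move together across databases in one unit, suggests a simple windowed correlation check. The sketch below is an assumption-laden illustration: it uses Pearson correlation, fixed non-overlapping windows, and a fixed threshold, whereas the abstract describes its own correlation measure, a flexible window mechanism, and an adaptive threshold. The function names and sample series are made up.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length series; 0.0 if either one is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx and sy else 0.0


def low_correlation_windows(a, b, window=4, threshold=0.5):
    """Return start indices of windows where the two series stop moving together.

    Fixed window size and threshold are simplifications, not DBCatcher's policy.
    """
    flagged = []
    for start in range(0, len(a) - window + 1, window):
        r = pearson(a[start:start + window], b[start:start + window])
        if r < threshold:
            flagged.append(start)
    return flagged


# Made-up KPI samples (e.g. CPU utilization) from two databases in the same unit.
db1 = [1, 2, 3, 4, 5, 6, 7, 8]
db2 = [2, 4, 6, 8, 9, 7, 5, 3]
print(low_correlation_windows(db1, db2))  # [4]
```

In the first window both series rise together, so nothing is flagged; in the second, `db2` falls while `db1` keeps rising, and the window starting at index 4 is reported.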

Citations: 1

RETIA: Relation-Entity Twin-Interact Aggregation for Temporal Knowledge Graph Extrapolation

2023 IEEE 39th International Conference on Data Engineering (ICDE) Pub Date : 2023-04-01 DOI: 10.1109/ICDE55515.2023.00138

Kangzheng Liu, Feng Zhao, Guandong Xu, Xianzhi Wang, Hai Jin

Abstract: Temporal knowledge graph (TKG) extrapolation aims to predict future unknown events (facts) based on historical information, and has attracted considerable attention due to its great practical significance. Accurate representations (embeddings) of entities and relations form the basis of TKG extrapolation. Recent work has been devoted to improving the rationality of entity representations. However, on the one hand, ignoring relation modeling results in incomplete relation representations; therefore, some approaches aggregate only immediately adjacent entities of relations, but this can lead to the "message islands" problem of relation modeling. On the other hand, ignoring the association constraints between relations and entities can make the embeddings of both relations and entities prone to overfitting. To address the abovementioned challenges, we propose an advanced method, namely, RETIA. For the former issue, we generate twin hyperrelation subgraphs for each historical subgraph and then aggregate both the adjacent entities and relations in the hyperrelation subgraphs through a graph convolutional network (GCN). For the latter, we propose a twin-interact module (TIM), which provides communication channels for relation aggregation and entity aggregation during the evolution of the historical sequence. Experiments conducted on five public datasets show that RETIA has made great improvements across several evaluation metrics. Our released code is available at https://github.com/CGCL-codes/RETIA.

Citations: 3

Multimodal Biological Knowledge Graph Completion via Triple Co-Attention Mechanism

2023 IEEE 39th International Conference on Data Engineering (ICDE) Pub Date : 2023-04-01 DOI: 10.1109/ICDE55515.2023.10231041

Derong Xu, Jingbo Zhou, Tong Xu, Yuan Xia, Ji Liu, Enhong Chen, D. Dou

Abstract: Biological Knowledge Graphs (BKGs) can help to model complex biological systems in a structural way to support various tasks. Nevertheless, the incompleteness problem may limit the performance of existing BKGs, which still calls for new methods to reveal the missing relations. Though great efforts have been devoted to knowledge graph completion, existing methods are not easily adapted to multimodal biological information such as molecular structures and textual descriptions. To this end, we propose a novel co-attention-based multimodal embedding framework, named CamE, for the multimodal BKG completion task. Specifically, we design a Triple Co-Attention (TCA) operator to capture and highlight the same semantic features among different modalities. Based on TCA, we further propose two components to handle multimodal fusion and multimodal entity-relation interaction, respectively. One is the multimodal TCA fusion module, which achieves a multimodal joint representation for each entity in the BKG. It aims to project different modal information into a common space by capturing the same semantic features and overcoming the modality gap. The other is the relation-aware interactive TCA module, which learns an interactive representation by modelling the deep interaction between multimodal entities and relations. Extensive experiments on two real-world multimodal BKG datasets demonstrate that our method significantly outperforms several state-of-the-art baselines, including improvements of 10.3% and 16.2% w.r.t. the MRR and Hits@1 metrics over its best competitors on the public DRKG-MM dataset.
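The MRR and Hits@1 metrics cited above are standard link-prediction measures computed from the rank the model assigns to the correct entity in each test query. The sketch below uses their usual definitions; the rank values are made up and do not come from the paper.

```python
def mrr(ranks):
    """Mean reciprocal rank: average of 1 / rank of the correct entity (ranks start at 1)."""
    return sum(1.0 / r for r in ranks) / len(ranks)


def hits_at_k(ranks, k):
    """Fraction of test queries whose correct entity is ranked within the top k."""
    return sum(1 for r in ranks if r <= k) / len(ranks)


# Made-up ranks of the correct entity for four test triples.
ranks = [1, 2, 4, 1]
print(mrr(ranks))           # 0.6875
print(hits_at_k(ranks, 1))  # 0.5
```

Hits@1 only rewards exact top-ranked answers, while MRR also gives partial credit for near misses, which is why the two can improve by different amounts on the same dataset.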

Citations: 1