Distribution-free conformal joint prediction regions for neural marked temporal point processes

IF 2.9 3区计算机科学 Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

Machine Learning Pub Date : 2024-07-23 DOI:10.1007/s10994-024-06594-z

Victor Dheur, Tanguy Bosser, Rafael Izbicki, Souhaib Ben Taieb

{"title":"Distribution-free conformal joint prediction regions for neural marked temporal point processes","authors":"Victor Dheur, Tanguy Bosser, Rafael Izbicki, Souhaib Ben Taieb","doi":"10.1007/s10994-024-06594-z","DOIUrl":null,"url":null,"abstract":"<p>Sequences of labeled events observed at irregular intervals in continuous time are ubiquitous across various fields. Temporal Point Processes (TPPs) provide a mathematical framework for modeling these sequences, enabling inferences such as predicting the arrival time of future events and their associated label, called mark. However, due to model misspecification or lack of training data, these probabilistic models may provide a poor approximation of the true, unknown underlying process, with prediction regions extracted from them being unreliable estimates of the underlying uncertainty. This paper develops more reliable methods for uncertainty quantification in neural TPP models via the framework of conformal prediction. A primary objective is to generate a distribution-free joint prediction region for an event’s arrival time and mark, with a finite-sample marginal coverage guarantee. A key challenge is to handle both a strictly positive, continuous response and a categorical response, without distributional assumptions. We first consider a simple but overly conservative approach that combines individual prediction regions for the event’s arrival time and mark. Then, we introduce a more effective method based on bivariate highest density regions derived from the joint predictive density of arrival times and marks. By leveraging the dependencies between these two variables, this method excludes unlikely combinations of the two, resulting in sharper prediction regions while still attaining the pre-specified coverage level. We also explore the generation of individual univariate prediction regions for events’ arrival times and marks through conformal regression and classification techniques. Moreover, we evaluate the stronger notion of conditional coverage. Finally, through extensive experimentation on both simulated and real-world datasets, we assess the validity and efficiency of these methods.</p>","PeriodicalId":49900,"journal":{"name":"Machine Learning","volume":"31 1","pages":""},"PeriodicalIF":2.9000,"publicationDate":"2024-07-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Machine Learning","FirstCategoryId":"94","ListUrlMain":"https://doi.org/10.1007/s10994-024-06594-z","RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}

引用次数: 0

Abstract

Sequences of labeled events observed at irregular intervals in continuous time are ubiquitous across various fields. Temporal Point Processes (TPPs) provide a mathematical framework for modeling these sequences, enabling inferences such as predicting the arrival time of future events and their associated label, called mark. However, due to model misspecification or lack of training data, these probabilistic models may provide a poor approximation of the true, unknown underlying process, with prediction regions extracted from them being unreliable estimates of the underlying uncertainty. This paper develops more reliable methods for uncertainty quantification in neural TPP models via the framework of conformal prediction. A primary objective is to generate a distribution-free joint prediction region for an event’s arrival time and mark, with a finite-sample marginal coverage guarantee. A key challenge is to handle both a strictly positive, continuous response and a categorical response, without distributional assumptions. We first consider a simple but overly conservative approach that combines individual prediction regions for the event’s arrival time and mark. Then, we introduce a more effective method based on bivariate highest density regions derived from the joint predictive density of arrival times and marks. By leveraging the dependencies between these two variables, this method excludes unlikely combinations of the two, resulting in sharper prediction regions while still attaining the pre-specified coverage level. We also explore the generation of individual univariate prediction regions for events’ arrival times and marks through conformal regression and classification techniques. Moreover, we evaluate the stronger notion of conditional coverage. Finally, through extensive experimentation on both simulated and real-world datasets, we assess the validity and efficiency of these methods.

Abstract Image

查看原文本刊更多论文

神经标记时点过程的无分布共形联合预测区域

在连续时间中以不规则间隔观察到的标记事件序列在各个领域无处不在。时点过程（TPPs）为这些序列建模提供了一个数学框架，可用于推断，如预测未来事件的到达时间及其相关标签（称为标记）。然而，由于模型规范错误或缺乏训练数据，这些概率模型可能无法很好地近似真实、未知的底层过程，从中提取的预测区域对底层不确定性的估计并不可靠。本文通过保形预测框架，为神经 TPP 模型的不确定性量化开发了更可靠的方法。其主要目标是为事件的到达时间和标记生成一个无分布的联合预测区域，并保证有限样本的边际覆盖。一个关键的挑战是如何在不考虑分布假设的情况下，同时处理严格的正向连续响应和分类响应。我们首先考虑了一种简单但过于保守的方法，即结合事件到达时间和标记的单个预测区域。然后，我们介绍了一种更有效的方法，它基于从到达时间和标记的联合预测密度得出的二元最高密度区域。通过利用这两个变量之间的依赖关系，该方法排除了这两个变量的不可能组合，从而产生了更清晰的预测区域，同时仍能达到预先指定的覆盖水平。我们还探索了通过保形回归和分类技术生成事件到达时间和标记的单变量预测区域。此外，我们还评估了更强的条件覆盖概念。最后，通过在模拟和真实世界数据集上进行大量实验，我们评估了这些方法的有效性和效率。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

求助全文

约1分钟内获得全文求助全文

来源期刊

Machine Learning 工程技术-计算机：人工智能

CiteScore

11.00

自引率

2.70%

发文量

162

审稿时长

3 months

期刊介绍： Machine Learning serves as a global platform dedicated to computational approaches in learning. The journal reports substantial findings on diverse learning methods applied to various problems, offering support through empirical studies, theoretical analysis, or connections to psychological phenomena. It demonstrates the application of learning methods to solve significant problems and aims to enhance the conduct of machine learning research with a focus on verifiable and replicable evidence in published papers.