Approximating decision trees with priority hypotheses

IF 0.9 · Region 4 (Computer Science) · JCR Q3 (COMPUTER SCIENCE, THEORY & METHODS)

Theoretical Computer Science · Pub Date: 2024-10-09 · DOI: 10.1016/j.tcs.2024.114896

Jing Yuan, Shaojie Tang

Theoretical Computer Science, volume 1023, Article 114896. Source: https://www.sciencedirect.com/science/article/pii/S0304397524005139

Citations: 0

Abstract

This paper addresses the problem of creating decision trees for identifying hypotheses, also known as entities, in a setting where the cost of an action is dependent on the true hypothesis. Specifically, we consider the scenario where $n$ hypotheses are divided into $m$ groups based on their priority levels. Taking an action on a higher priority hypothesis incurs a higher cost. This is relevant to many real-world applications where cost-sensitive decisions need to be made. For example, in a medical diagnosis task, the goal is to take a series of actions (such as medical tests) to identify a cause. Each action in this process requires conducting a test on the patient and observing the outcome, which can take anywhere from a few minutes to several weeks depending on the test. In this case, the cost (the result of waiting for the outcome) is higher if the true hypothesis is more time-sensitive. For example, if the true hypothesis is toxic chemical exposure (as opposed to a chronic disease such as diabetes), a delay of a few minutes could significantly increase the patient's risk of mortality. We propose a group greedy algorithm to solve this problem. We demonstrate that under worst-case scenarios, our algorithm has an approximation ratio of $O(m \log n)$. Importantly, when $m = 1$, meaning there is only one group of hypotheses, our result is consistent with the logarithmic approximation bound for the traditional optimal decision tree problem.
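To make the setting concrete, the following is a minimal sketch, not the paper's group greedy algorithm (the abstract does not spell it out). It assumes deterministic tests, a prior over hypotheses, and a per-action cost that depends only on the true hypothesis; the tree is built with a common balanced-split greedy heuristic for the classical single-group problem. The names `greedy_tree`, `expected_cost`, `prior`, and `weight` are all hypothetical.

```python
from typing import Dict, List, Sequence, Union

# A leaf is a hypothesis id; an internal node records the test and its children.
Tree = Union[int, dict]


def greedy_tree(hyps: Sequence[int], prior: Dict[int, float],
                tests: List[Dict[int, int]]) -> Tree:
    """Build an identification tree by repeatedly picking the most balanced test.

    tests[t][h] is the (deterministic) outcome of test t when h is the truth.
    This is a heuristic for the classical problem, assumed here for illustration.
    """
    if len(hyps) == 1:
        return hyps[0]
    best_t, best_parts, best_score = None, None, float("inf")
    for t, outcome in enumerate(tests):
        # Partition the remaining hypotheses by the outcome they would produce.
        parts: Dict[int, List[int]] = {}
        for h in hyps:
            parts.setdefault(outcome[h], []).append(h)
        if len(parts) < 2:
            continue  # this test distinguishes nothing among the remaining hypotheses
        # Score = probability mass of the heaviest outcome class (lower is better).
        score = max(sum(prior[h] for h in part) for part in parts.values())
        if score < best_score:
            best_t, best_parts, best_score = t, parts, score
    if best_parts is None:
        raise ValueError("remaining hypotheses cannot be distinguished")
    children = {o: greedy_tree(part, prior, tests) for o, part in best_parts.items()}
    return {"test": best_t, "children": children}


def expected_cost(tree: Tree, prior: Dict[int, float],
                  weight: Dict[int, float], depth: int = 0) -> float:
    """Expected cost when every action costs weight[h] if h is the true hypothesis."""
    if not isinstance(tree, dict):
        return prior[tree] * weight[tree] * depth
    return sum(expected_cost(child, prior, weight, depth + 1)
               for child in tree["children"].values())


if __name__ == "__main__":
    prior = {h: 0.25 for h in range(4)}
    tests = [
        {0: 0, 1: 0, 2: 1, 3: 1},
        {0: 0, 1: 1, 2: 0, 3: 1},
        {0: 1, 1: 0, 2: 0, 3: 0},
    ]
    # Hypothesis 0 is in a higher-priority group, so each action costs more when it is true.
    weight = {0: 3.0, 1: 1.0, 2: 1.0, 3: 1.0}
    tree = greedy_tree([0, 1, 2, 3], prior, tests)
    print(tree, expected_cost(tree, prior, weight))
```

In this sketch the tree construction ignores priorities entirely; the priority structure only enters through `weight` when the tree is evaluated. A priority-aware method would presumably try to identify high-weight hypotheses at shallower depth.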




Source journal

Theoretical Computer Science (Engineering & Technology, Computer Science: Theory & Methods)

CiteScore

2.60

Self-citation rate

18.20%

Articles published

471

Review time

12.6 months

Journal description: Theoretical Computer Science is mathematical and abstract in spirit, but it derives its motivation from practical and everyday computation. Its aim is to understand the nature of computation and, as a consequence of this understanding, provide more efficient methodologies. All papers introducing or studying mathematical, logic and formal concepts and methods are welcome, provided that their motivation is clearly drawn from the field of computing.