{"title":"Approximating decision trees with priority hypotheses","authors":"Jing Yuan , Shaojie Tang","doi":"10.1016/j.tcs.2024.114896","DOIUrl":null,"url":null,"abstract":"<div><div>This paper addresses the problem of creating decision trees for identifying hypotheses, also known as entities, in a setting where the cost of an action is dependent on the true hypothesis. Specifically, we consider the scenario where <em>n</em> hypotheses are divided into <em>m</em> groups based on their priority levels. Taking an action on a higher priority hypothesis incurs a higher cost. This is relevant to many real-world applications where cost-sensitive decisions need to be made. For example, in a medical diagnosis task, the goal is to take a series of actions (such as medical tests) to identify a cause. Each action in this process requires conducting a test on the patient and observing the outcome, which can take anywhere from a few minutes to several weeks depending on the test. In this case, the cost (the result of waiting for the outcome) is higher if the true hypothesis is more time-sensitive. For example, if the true hypothesis is toxic chemical exposure (as opposed to a chronic disease such as diabetes), a delay of a few minutes could significantly increase the patient's risk of mortality. We propose a group greedy algorithm to solve this problem. We demonstrate that under worst-case scenarios, our algorithm has an approximation ratio of <span><math><mi>O</mi><mo>(</mo><mi>m</mi><mi>log</mi><mo></mo><mi>n</mi><mo>)</mo></math></span>. Importantly, when <span><math><mi>m</mi><mo>=</mo><mn>1</mn></math></span>, meaning there is only one group of hypotheses, our result is consistent with the logarithmic approximation bound for the traditional optimal decision tree problem.</div></div>","PeriodicalId":49438,"journal":{"name":"Theoretical Computer Science","volume":"1023 ","pages":"Article 114896"},"PeriodicalIF":0.9000,"publicationDate":"2024-10-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Theoretical Computer Science","FirstCategoryId":"94","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S0304397524005139","RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q3","JCRName":"COMPUTER SCIENCE, THEORY & METHODS","Score":null,"Total":0}
引用次数: 0
Abstract
This paper addresses the problem of creating decision trees for identifying hypotheses, also known as entities, in a setting where the cost of an action is dependent on the true hypothesis. Specifically, we consider the scenario where n hypotheses are divided into m groups based on their priority levels. Taking an action on a higher priority hypothesis incurs a higher cost. This is relevant to many real-world applications where cost-sensitive decisions need to be made. For example, in a medical diagnosis task, the goal is to take a series of actions (such as medical tests) to identify a cause. Each action in this process requires conducting a test on the patient and observing the outcome, which can take anywhere from a few minutes to several weeks depending on the test. In this case, the cost (the result of waiting for the outcome) is higher if the true hypothesis is more time-sensitive. For example, if the true hypothesis is toxic chemical exposure (as opposed to a chronic disease such as diabetes), a delay of a few minutes could significantly increase the patient's risk of mortality. We propose a group greedy algorithm to solve this problem. We demonstrate that under worst-case scenarios, our algorithm has an approximation ratio of . Importantly, when , meaning there is only one group of hypotheses, our result is consistent with the logarithmic approximation bound for the traditional optimal decision tree problem.
期刊介绍:
Theoretical Computer Science is mathematical and abstract in spirit, but it derives its motivation from practical and everyday computation. Its aim is to understand the nature of computation and, as a consequence of this understanding, provide more efficient methodologies. All papers introducing or studying mathematical, logic and formal concepts and methods are welcome, provided that their motivation is clearly drawn from the field of computing.