Data and decision making – from odd to artificial

Q4 Medicine
L. Marais
SA Orthopaedic Journal, published 2022-01-01. DOI: 10.17159/2309-8309/2022/v21n2a0 (https://doi.org/10.17159/2309-8309/2022/v21n2a0)
Citations: 0

Abstract

Data and decision making – from odd to artificial
With my term as Editor-in-Chief of the SAOJ coming to an end soon, I cannot help but reflect on some of my past experiences in this role. Perhaps the most challenging (and satisfying) was the need to get to grips with some of the more intricate aspects of research methodology and statistics. At first glance, these concepts seem fairly straightforward, but almost invariably they become exceedingly complex the harder you look. The odds ratio (OR) is an excellent case in point.

There are a number of ways in which the association between an exposure and an outcome can be expressed, and ORs are probably the most commonly used. The current emphasis on reporting 95% confidence intervals (CI), rather than only p-values, has resulted in us seeing and doing a lot more logistic regression. Along with the 95% CI, the statistical program also provides the OR, which is then reported in our results.

Now, ORs are tricky things. To justify this statement, I am going to have to go way back to the start, where all good research should start: with the definitions. A ratio is simply a number obtained by dividing one number by another, and there is not necessarily a relationship between the numerator and denominator. A proportion is a ratio that relates a part to a whole, so the numerator is contained in the denominator. A rate is a proportion where the denominator also takes into account another dimension, typically time. Defining probability (P) is a minefield, but for our purposes, we will limit it to the measure of the likelihood that an event will occur.

With the basics out of the way, let us delve a little deeper. Relative risk (RR), also known as the risk ratio, is a descriptive statistic commonly used in analytical studies. Risk can be defined as the probability of the outcome of interest occurring. RR is therefore essentially a ratio of proportions.
In statistical terms, RR is equal to the event rate in the exposed group divided by the event rate in the non-exposed (control) group (Figure 1). For example, imagine we are performing a study comparing the risk of developing infection following grade III open fractures when antibiotics are given within an hour of the injury (treatment group) or not (control group). If 5 out of 100 patients in the treatment group and 20 out of 100 patients in the control group get an infection, we have a relative risk of 0.05/0.20 = 0.25. RR = 0.25 means exposed patients (i.e., those in the treatment group) are 0.25 times as likely to develop the outcome of interest as control patients. We could also state that patients receiving antibiotics within an hour were 75% (0.75 = 1 − 0.25) less likely to develop infection. As clinicians we generally prefer to think in terms of probabilities and relative risk.

The other commonly used descriptive statistic for reporting a measure of association is the odds ratio (OR). Odds can be defined as the relative probability of the outcome of interest occurring. So, what is this probability relative to? The probability of the outcome not occurring. In other words, odds represent the ratio of the probability of the event occurring to the probability of the event not occurring. Mathematically, odds are equal to P/(1 − P). The OR then is a ratio of ratios, equal to the odds of the outcome in the exposed group divided by the odds of the outcome in the non-exposed (control) group. An OR < 1 means reduced odds of the outcome of interest occurring, while an OR > 1 implies increased odds. Thus, in our open fracture example study, the OR would be (5/95)/(20/80) ≈ 0.21. This would mean that the odds (not the risk) of infection are 79% lower in the group that received antibiotics. If an OR is 3.8, that would mean that the odds of the outcome of interest occurring are 3.8 times as high in the exposed group.
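The arithmetic behind the open fracture example can be checked with a minimal Python sketch. The function and variable names here are illustrative assumptions, not taken from the editorial; only the counts (5/100 and 20/100) come from the text.

```python
def risk(events: int, total: int) -> float:
    """Risk: probability of the outcome, P = events / total."""
    return events / total


def odds(events: int, total: int) -> float:
    """Odds: P / (1 - P), which simplifies to events / non-events."""
    return events / (total - events)


# Counts from the open fracture example in the text.
treated_infected, treated_total = 5, 100    # antibiotics within an hour
control_infected, control_total = 20, 100   # no early antibiotics

relative_risk = risk(treated_infected, treated_total) / risk(control_infected, control_total)
odds_ratio = odds(treated_infected, treated_total) / odds(control_infected, control_total)

print(f"RR = {relative_risk:.2f}")  # 0.05 / 0.20
print(f"OR = {odds_ratio:.2f}")     # (5/95) / (20/80)
```

The two measures differ (0.25 versus roughly 0.21) even though they are computed from the same counts, which is why a 79% reduction in odds should not be read as a 79% reduction in risk.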
For the sake of completeness, I will also mention the number needed to treat (NNT), which is essentially the number of patients that need to receive the exposure to prevent one unwanted outcome. It is defined as the inverse of the absolute risk reduction (ARR). ARR is equal to the event rate in the control group (CER) minus the event rate in the exposed group (EER).

At this point, it might be useful to reflect on the origin of ORs. The first rationale has to do with study design. In cross-sectional studies, the RR can be calculated from the prevalence. In cohort studies, RR can be calculated from the incidence. If the incidence or prevalence is not available, as in case-control studies, then the OR may be the only option to provide an indication of the measure of association [1]. It is important to remember that case-control studies are typically used to study rare diseases or events. Why this is relevant will hopefully make more sense shortly.

The second reason for ORs' existence is statistical in nature and somewhat more complex. Basically, logistic regression provides an OR rather than an RR, even in a cohort study, because of the frequency of convergence problems during the mathematical modelling [2]. What is a convergence problem? The explanation is beyond the scope of this piece, and my understanding. It has something to do with the fact that regression aims to maximise the likelihood (by finding
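The ARR and NNT definitions given earlier in this abstract can be applied to the same open fracture example. The sketch below is illustrative only: the variable names are assumptions, and rounding the NNT up to a whole patient is a common reporting convention that the editorial itself does not state.

```python
import math

# Event rates from the open fracture example (20/100 control, 5/100 treated).
cer = 20 / 100  # control event rate (CER)
eer = 5 / 100   # exposed (treated) event rate (EER)

arr = cer - eer   # absolute risk reduction: CER - EER
nnt = 1 / arr     # number needed to treat: inverse of the ARR

# Assumption: report the NNT rounded up to the next whole patient.
nnt_reported = math.ceil(nnt)

print(f"ARR = {arr:.2f}")
print(f"NNT = {nnt:.2f} (reported as {nnt_reported})")
```

Under these numbers, about seven patients would need to receive antibiotics within an hour to prevent one infection.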
Source journal
SA Orthopaedic Journal (Medicine: Orthopedics and Sports Medicine)
CiteScore: 0.40
Self-citation rate: 0.00%
Articles published: 17
Review time: 6 weeks