Xiaoyu Cai, Wei Zhang, Huiyun Li, Zhaohai Li, Aiyi Liu
Estimation of receiver operating characteristic curve when case and control require different transformations for normality.
Statistical Methods in Medical Research. Journal Article. Published 2025-07-09. DOI: 10.1177/09622802251354921 (https://doi.org/10.1177/09622802251354921)
Abstract
The receiver operating characteristic (ROC) curve is a popular tool for evaluating the discriminative ability of a diagnostic biomarker. Both parametric and nonparametric methods exist for estimating the ROC curve and its associated summary measures, typically from case-control data. Because the ROC curve is unchanged when the same monotone transformation is applied to all biomarker values, data from both cases (diseased subjects) and controls (non-diseased subjects) are often transformed with a common Box-Cox transformation (or another appropriate transformation) before a parametric estimation method is applied. However, careful examination of the data often reveals that the biomarker values in the diseased and non-diseased populations can be approximated by normal distributions only after different transformations. In this situation, existing estimation methods cannot be applied directly to the heterogeneously transformed data. In this article, we consider the setting in which biomarker data from the diseased and non-diseased populations are normally distributed after being transformed with different Box-Cox transformations. Under this assumption, we show that existing methods based on a common Box-Cox transformation are invalid in that they exhibit substantial bias. We then propose a method to estimate the underlying ROC curve and its area under the curve (AUC), and compare its performance with that of the nonparametric estimator, which makes no distributional assumptions, and with that of estimators based on the common Box-Cox transformation assumption. The method is illustrated with HIV infection data from the National Health and Nutrition Examination Survey (NHANES).
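To make the setting concrete, the following is a minimal plug-in sketch of a receiver operating characteristic curve and its area under the curve when each group gets its own Box-Cox fit. It is not the estimator proposed in the article, whose details the abstract does not give. The function names (`fit_boxcox_normal`, `roc_separate_boxcox`, `empirical_auc`) and the simulated data are hypothetical, and the sketch assumes positive biomarker values and negligible truncation of the Box-Cox normal model. Because the two groups are transformed differently, each threshold is mapped back to the original biomarker scale before the case probability is evaluated.

```python
import numpy as np
from scipy import stats
from scipy.special import boxcox, inv_boxcox


def fit_boxcox_normal(x):
    """Estimate a Box-Cox lambda by maximum likelihood, then the normal
    mean and standard deviation of the transformed values."""
    y, lam = stats.boxcox(np.asarray(x, dtype=float))  # requires x > 0
    return lam, y.mean(), y.std(ddof=1)


def roc_separate_boxcox(controls, cases, n_grid=999):
    """Plug-in ROC curve and AUC with a separate Box-Cox fit per group."""
    lam0, mu0, sd0 = fit_boxcox_normal(controls)
    lam1, mu1, sd1 = fit_boxcox_normal(cases)

    # Interior false-positive rates; the endpoints 0 and 1 are added below.
    fpr = np.linspace(0.001, 0.999, n_grid)

    # Threshold on the ORIGINAL scale that a control exceeds with probability fpr.
    # Assumption: truncation of the Box-Cox normal is negligible, so the
    # inverse transform is defined at every grid point.
    cutoff = inv_boxcox(mu0 + sd0 * stats.norm.ppf(1.0 - fpr), lam0)

    # Sensitivity: probability that a case exceeds the same cutoff,
    # evaluated on the cases' own transformed scale.
    tpr = stats.norm.sf((boxcox(cutoff, lam1) - mu1) / sd1)

    fpr = np.concatenate(([0.0], fpr, [1.0]))
    tpr = np.concatenate(([0.0], tpr, [1.0]))
    auc = float(np.sum(np.diff(fpr) * (tpr[1:] + tpr[:-1]) / 2.0))  # trapezoid rule
    return fpr, tpr, auc


def empirical_auc(controls, cases):
    """Nonparametric AUC: share of (case, control) pairs with case > control,
    counting ties as one half."""
    cases = np.asarray(cases, dtype=float)[:, None]
    controls = np.asarray(controls, dtype=float)[None, :]
    return float(np.mean((cases > controls) + 0.5 * (cases == controls)))


# Simulated data (not from the paper): log(controls) and sqrt(cases) are normal,
# so the two groups call for different Box-Cox parameters (about 0 and 0.5).
rng = np.random.default_rng(0)
controls = np.exp(rng.normal(0.0, 0.5, size=500))
cases = rng.normal(2.0, 0.5, size=500) ** 2

fpr, tpr, auc = roc_separate_boxcox(controls, cases)
print(f"separate Box-Cox AUC: {auc:.3f}, empirical AUC: {empirical_auc(controls, cases):.3f}")
```

When the model is correctly specified, as in this simulation, the plug-in and empirical values should be close. The sketch does not attempt to reproduce the bias comparison reported in the article.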
About the journal:
Statistical Methods in Medical Research is a peer-reviewed scholarly journal and is the leading vehicle for articles in all the main areas of medical statistics and an essential reference for all medical statisticians. This unique journal is devoted solely to statistics and medicine and aims to keep professionals abreast of the many powerful statistical techniques now available to the medical profession. This journal is a member of the Committee on Publication Ethics (COPE).