{"title":"Valid instrumental variable selection method using negative control outcomes and constructing efficient estimator","authors":"Shunichiro Orihara, Atsushi Goto, Masataka Taguri","doi":"10.1002/bimj.202300113","DOIUrl":"10.1002/bimj.202300113","url":null,"abstract":"<p>In observational studies, instrumental variable (IV) methods are commonly applied when there are unmeasured covariates. In Mendelian randomization, an allele score is often constructed from many single nucleotide polymorphisms; however, including invalid IVs risks biased estimates of causal effects. Invalid IVs are those IV candidates that are associated with unobserved variables. To solve this problem, we developed a novel strategy using negative control outcomes (NCOs) as auxiliary variables. Using NCOs, we are able to select only valid IVs and exclude invalid IVs without knowing which of the instruments are invalid. We also developed a new two-step estimation procedure and proved the semiparametric efficiency of our estimator. In simulations, the proposed method outperformed some previous methods. Subsequently, we applied the proposed method to the UK Biobank dataset. Our results demonstrate that the use of an auxiliary variable, such as an NCO, enables the selection of valid IVs with assumptions different from those used in previous methods.</p>","PeriodicalId":55360,"journal":{"name":"Biometrical Journal","volume":"66 4","pages":""},"PeriodicalIF":1.7,"publicationDate":"2024-05-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141155844","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Predicting class switch recombination in B-cells from antibody repertoire data","authors":"Lutecia Servius, Davide Pigoli, Joseph Ng, Franca Fraternali","doi":"10.1002/bimj.202300171","DOIUrl":"10.1002/bimj.202300171","url":null,"abstract":"<p>Statistical and machine learning methods have proved useful in many areas of immunology. In this paper, we address for the first time the problem of predicting the occurrence of class switch recombination (CSR) in B-cells, a problem of interest in understanding antibody response under immunological challenges. We propose a framework to analyze antibody repertoire data, based on clonal group (CG) representation in a way that allows us to predict CSR events using CG level features as input. We assess and compare the performance of several predictive models (logistic regression, LASSO logistic regression, random forest, and support vector machine) in carrying out this task. The proposed approach can obtain an unweighted average recall of 71% with models based on variable region descriptors and measures of CG diversity during an immune challenge and, most notably, before an immune challenge.</p>","PeriodicalId":55360,"journal":{"name":"Biometrical Journal","volume":"66 4","pages":""},"PeriodicalIF":1.7,"publicationDate":"2024-05-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1002/bimj.202300171","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141089535","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A nonparametric proportional risk model to assess a treatment effect in time-to-event data","authors":"Lucia Ameis, Oliver Kuss, Annika Hoyer, Kathrin Möllenhoff","doi":"10.1002/bimj.202300147","DOIUrl":"10.1002/bimj.202300147","url":null,"abstract":"<p>Time-to-event analysis often relies on prior parametric assumptions, or, if a semiparametric approach is chosen, Cox's model. This is inherently tied to the assumption of proportional hazards, with the analysis potentially invalidated if this assumption is not fulfilled. In addition, most interpretations focus on the hazard ratio, that is often misinterpreted as the relative risk (RR), the ratio of the cumulative distribution functions. In this paper, we introduce an alternative to current methodology for assessing a treatment effect in a two-group situation, not relying on the proportional hazards assumption but assuming proportional risks. Precisely, we propose a new nonparametric model to directly estimate the RR of two groups to experience an event under the assumption that the risk ratio is constant over time. In addition to this relative measure, our model allows for calculating the number needed to treat as an absolute measure, providing the possibility of an easy and holistic interpretation of the data. We demonstrate the validity of the approach by means of a simulation study and present an application to data from a large randomized controlled trial investigating the effect of dapagliflozin on all-cause mortality.</p>","PeriodicalId":55360,"journal":{"name":"Biometrical Journal","volume":"66 4","pages":""},"PeriodicalIF":1.7,"publicationDate":"2024-05-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1002/bimj.202300147","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141089524","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A method for determining groups in cumulative incidence curves in competing risk data","authors":"Marta Sestelo, Luís Meira-Machado, Nora M. Villanueva, Javier Roca-Pardiñas","doi":"10.1002/bimj.202300084","DOIUrl":"10.1002/bimj.202300084","url":null,"abstract":"<p>The cumulative incidence function is the standard method for estimating the marginal probability of a given event in the presence of competing risks. One basic but important goal in the analysis of competing risk data is the comparison of these curves, for which limited literature exists. We proposed a new procedure that lets us not only test the equality of these curves but also group them if they are not equal. The proposed method allows determining the composition of the groups as well as an automatic selection of their number. Simulation studies show the good numerical behavior of the proposed methods for finite sample size. The applicability of the proposed method is illustrated using real data.</p>","PeriodicalId":55360,"journal":{"name":"Biometrical Journal","volume":"66 4","pages":""},"PeriodicalIF":1.7,"publicationDate":"2024-05-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1002/bimj.202300084","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141077094","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Sparse Group Penalties for bi-level variable selection","authors":"Gregor Buch, Andreas Schulz, Irene Schmidtmann, Konstantin Strauch, Philipp S. Wild","doi":"10.1002/bimj.202200334","DOIUrl":"10.1002/bimj.202200334","url":null,"abstract":"<p>Many data sets exhibit a natural group structure due to contextual similarities or high correlations of variables, such as lipid markers that are interrelated based on biochemical principles. Knowledge of such groupings can be used through bi-level selection methods to identify relevant feature groups and highlight their predictive members. One of the best known approaches of this kind combines the classical <i>Least Absolute Shrinkage and Selection Operator</i> (LASSO) with the <i>Group LASSO</i>, resulting in the <i>Sparse Group LASSO</i> (SGL). We propose the Sparse Group Penalty (SGP) framework, which allows for a flexible combination of different SGL-style shrinkage conditions. Analogous to SGL, we investigated the combination of the <i>Smoothly Clipped Absolute Deviation</i> (SCAD), the <i>Minimax Concave Penalty</i> (MCP) and the <i>Exponential Penalty</i> (EP) with their group versions, resulting in the <i>Sparse Group SCAD</i>, the <i>Sparse Group MCP</i>, and the novel <i>Sparse Group EP</i> (SGE). Those shrinkage operators provide refined control of the effect of group formation on the selection process through a tuning parameter. In simulation studies, SGPs were compared with other bi-level selection methods (Group Bridge, composite MCP, and Group Exponential LASSO) for variable and group selection evaluated with the Matthews correlation coefficient. We demonstrated the advantages of the new SGE in identifying parsimonious models, but also identified scenarios that highlight the limitations of the approach. The performance of the techniques was further investigated in a real-world use case for the selection of regulated lipids in a randomized clinical trial.</p>","PeriodicalId":55360,"journal":{"name":"Biometrical Journal","volume":"66 4","pages":""},"PeriodicalIF":1.7,"publicationDate":"2024-05-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1002/bimj.202200334","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140924034","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Comparative review of novel model-assisted designs for phase I/II clinical trials","authors":"Haolun Shi, Ruitao Lin, Xiaolei Lin","doi":"10.1002/bimj.202300398","DOIUrl":"10.1002/bimj.202300398","url":null,"abstract":"<p>In recent years, both model-based and model-assisted designs have emerged to efficiently determine the optimal biological dose (OBD) in phase I/II trials for immunotherapy and targeted cellular agents. Model-based designs necessitate repeated model fitting and computationally intensive posterior sampling for each dose-assignment decision, limiting their practical application in real trials. On the other hand, model-assisted designs employ simple statistical models and facilitate the precalculation of a decision table for use throughout the trial, eliminating the need for repeated model fitting. Due to their simplicity and transparency, model-assisted designs are often preferred in phase I/II trials. In this paper, we systematically evaluate and compare the operating characteristics of several recent model-assisted phase I/II designs, including TEPI, PRINTE, Joint i3+3, BOIN-ET, STEIN, uTPI, and BOIN12, in addition to the well-known model-based EffTox design, using comprehensive numerical simulations. To ensure an unbiased comparison, we generated 10,000 dosing scenarios using a random scenario generation algorithm for each predetermined OBD location. We thoroughly assess various performance metrics, such as the selection percentages, average patient allocation to OBD, and overdose percentages across the eight designs. Based on these assessments, we offer design recommendations tailored to different objectives, sample sizes, and starting dose locations.</p>","PeriodicalId":55360,"journal":{"name":"Biometrical Journal","volume":"66 4","pages":""},"PeriodicalIF":1.7,"publicationDate":"2024-05-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140913431","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Modeling tropical tuna shifts: An inflated power logit regression approach","authors":"Francisco F. Queiroz, Silvia L. P. Ferrari","doi":"10.1002/bimj.202300288","DOIUrl":"https://doi.org/10.1002/bimj.202300288","url":null,"abstract":"<p>We introduce a new class of zero-or-one inflated power logit (IPL) regression models, which serve as a versatile tool for analyzing bounded continuous data with observations at a boundary. These models are applied to explore the effects of climate changes on the distribution of tropical tuna within the North Atlantic Ocean. Our findings suggest that our modeling approach is adequate and capable of handling the outliers in the data. It exhibited superior performance compared to rival models in both diagnostic analysis and regarding the inference robustness. We offer a user-friendly method for fitting IPL regression models in practical applications.</p>","PeriodicalId":55360,"journal":{"name":"Biometrical Journal","volume":"66 4","pages":""},"PeriodicalIF":1.7,"publicationDate":"2024-05-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140820550","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Recoverability and estimation of causal effects under typical multivariable missingness mechanisms","authors":"Jiaxin Zhang, S. Ghazaleh Dashti, John B. Carlin, Katherine J. Lee, Margarita Moreno-Betancur","doi":"10.1002/bimj.202200326","DOIUrl":"https://doi.org/10.1002/bimj.202200326","url":null,"abstract":"<p>In the context of missing data, the identifiability or “recoverability” of the average causal effect (ACE) depends not only on the usual causal assumptions but also on missingness assumptions that can be depicted by adding variable-specific missingness indicators to causal diagrams, creating missingness directed acyclic graphs (m-DAGs). Previous research described canonical m-DAGs, representing typical multivariable missingness mechanisms in epidemiological studies, and examined mathematically the recoverability of the ACE in each case. However, this work assumed no effect modification and did not investigate methods for estimation across such scenarios. Here, we extend this research by determining the recoverability of the ACE in settings with effect modification and conducting a simulation study to evaluate the performance of widely used missing data methods when estimating the ACE using correctly specified g-computation. Methods assessed were complete case analysis (CCA) and various implementations of multiple imputation (MI) with varying degrees of compatibility with the outcome model used in g-computation. Simulations were based on an example from the Victorian Adolescent Health Cohort Study (VAHCS), where interest was in estimating the ACE of adolescent cannabis use on mental health in young adulthood. We found that the ACE is recoverable when no incomplete variable (exposure, outcome, or confounder) causes its own missingness, and nonrecoverable otherwise, in simplified versions of 10 canonical m-DAGs that excluded unmeasured common causes of missingness indicators. Despite this nonrecoverability, simulations showed that MI approaches that are compatible with the outcome model in g-computation may enable approximately unbiased estimation across all canonical m-DAGs considered, except when the outcome causes its own missingness or causes the missingness of a variable that causes its own missingness. In the latter settings, researchers may need to consider sensitivity analysis methods incorporating external information (e.g., delta-adjustment methods). The VAHCS case study illustrates the practical implications of these findings.</p>","PeriodicalId":55360,"journal":{"name":"Biometrical Journal","volume":"66 3","pages":""},"PeriodicalIF":1.7,"publicationDate":"2024-04-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1002/bimj.202200326","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140619794","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"On repeated diagnostic testing in screening for a medical condition: How often should the diagnostic test be repeated?","authors":"Patarawan Sangnawakij, Dankmar Böhning","doi":"10.1002/bimj.202300175","DOIUrl":"https://doi.org/10.1002/bimj.202300175","url":null,"abstract":"<p>In screening large populations a diagnostic test is frequently used repeatedly. An example is screening for bowel cancer using the fecal occult blood test (FOBT) on several occasions such as at 3 or 6 days. The question that is addressed here is how often should we repeat a diagnostic test when screening for a specific medical condition. Sensitivity is often used as a performance measure of a diagnostic test and is considered here for the individual application of the diagnostic test as well as for the overall screening procedure. The latter can involve an increasingly large number of repeated applications, but how many are sufficient? We demonstrate the issues involved in answering this question using real data on bowel cancer at St Vincents Hospital in Sydney. As data are only available for those testing positive at least once, an appropriate modeling technique is developed on the basis of the zero-truncated binomial distribution which allows for population heterogeneity. The latter is modeled using discrete nonparametric maximum likelihood. If we wish to achieve an overall sensitivity of 90%, the FOBT should be repeated for 2 weeks instead of the 1 week that was used at the time of the survey. A simulation study also shows consistency in the sense that bias and standard deviation for the estimated sensitivity decrease with an increasing number of repeated occasions as well as with increasing sample size.</p>","PeriodicalId":55360,"journal":{"name":"Biometrical Journal","volume":"66 3","pages":""},"PeriodicalIF":1.7,"publicationDate":"2024-04-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140619755","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}