Methods of multicenter trials in psychiatry part I: Review

Kurt A Fischer-Cornelssen
Progress in neuro-psychopharmacology, 1981. DOI: 10.1016/0364-7722(81)90096-5
Citations: 1

Abstract

General

Fifty out of 100 publications on multicenter trials and 21 on methods are listed and partly discussed. Our discussion concentrates only on double-blind therapeutic trials. Multicenter trials are the only way to keep sources of heterogeneity or error under control. Although they make high demands, well-conducted multicenter trials have advantages that far outweigh their disadvantages.

    1. How to conduct a multicenter trial.

      1.1 Introduction: In a trial the variables of interest are the efficacy and tolerance of drugs. Therefore every effort has to be made to minimize all other, non-treatment sources of variance or to keep them constant. A multicenter trial has to be a logical, feasible, carefully planned and standardized sequence of events, carried out exactly as planned.

      1.2 Sample consideration: Heterogeneity of patients' diagnoses is an additional source of variance. Any variability other than the drug effect lowers the trial's chance of success and the reliability and validity of its results. For an efficient test of a hypothesis, the only suitable target patients are those with endogenous depression (for antidepressants), exacerbated paranoid schizophrenia (for neuroleptics) and chronic anxiety states (for minor tranquilizers) of medium to severe degree. Despite its difficulties, diagnostic labeling of patients (WHO ICD-9) is an absolute necessity: symptoms are only meaningful in the context of the diagnosis concerned. The course of illness, past and present, is also important (e.g. spontaneous remission during the study). Statistical reasons require much larger sample sizes than are usually considered: with an α-risk of 5%, a β-risk of 10%, an expected improvement of 50% on a standard drug and of 60% on a new drug, 2 × 422 patients are needed to demonstrate a difference.
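The abstract does not say which formula lies behind the figure of 2 × 422 patients. As a rough check, the sketch below uses the classical normal-approximation formula for comparing two proportions (pooled variance under the null hypothesis, unpooled under the alternative). The one-sided reading of the 5% α-risk is an assumption, as are the function and parameter names; under that assumption the result comes out at about 422 patients per group.

```python
from math import sqrt
from statistics import NormalDist


def patients_per_group(p_standard, p_new, alpha, beta, two_sided=False):
    """Approximate patients per group for comparing two improvement rates.

    Normal approximation; the choice of formula is an assumption,
    not something stated in the abstract.
    """
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - (alpha / 2 if two_sided else alpha))
    z_beta = z.inv_cdf(1 - beta)
    p_bar = (p_standard + p_new) / 2
    # Standard deviation terms under H0 (pooled) and H1 (unpooled).
    sd_null = sqrt(2 * p_bar * (1 - p_bar))
    sd_alt = sqrt(p_standard * (1 - p_standard) + p_new * (1 - p_new))
    return ((z_alpha * sd_null + z_beta * sd_alt) / (p_new - p_standard)) ** 2


# Figures from the text: 50% vs. 60% improvement, alpha = 5%, beta = 10%.
n_one_sided = patients_per_group(0.50, 0.60, alpha=0.05, beta=0.10)
n_two_sided = patients_per_group(0.50, 0.60, alpha=0.05, beta=0.10, two_sided=True)
print(round(n_one_sided), round(n_two_sided))  # about 422 and about 518
```

With a two-sided 5% test the same formula asks for roughly 518 patients per group, so the quoted figure is best read as a lower bound for that design.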

      1.3 Settings and investigators: Variability between studies, which arises from numerous sources, should be minimized as far as possible so that the trial can concentrate on the efficacy and tolerance of the drugs: “prevention is better than cure”. The better qualified, more experienced and more capable the investigators and nurses, the better and more reliable the results will be.

      1.4 Experimental design: Clinical multicenter trials should be organized and conducted in a way that resembles ordinary clinical practice and is in the best interest of the (future) patients (Declaration of Helsinki/Tokyo). The design, the protocol, the patient forms, the execution and the evaluation should be a masterpiece of clarity. There is only one way to minimize undesired variability and deviations: standardization of every detail from beginning to end, with due regard to logic, reality, feasibility and practicability. One kind of standardization should never be attempted: a fixed dose of drugs. Treatment duration should not be less than 6 weeks (phase I and II).

      Up to now nearly all frequently used rating scales have been based on psychological thinking. We need scales built on clinical, psychiatric and statistical experience, with high practicability and a broad spectrum of the most important, unbound symptoms. Interrater reliability data for scales mostly relate to highly trained raters under maximal conditions and not, as would be necessary, to normal practical situations: each rater population has its own reliability. Far more important is intrarater reliability (consistency, stability), e.g. in untrained raters. Content validity depends on the clinical and psychiatric input. Construct validity can be demonstrated by factor and cluster analyses and confirmed by the clinical relevance of their results. But establishing concurrent validity by setting a new rating scale against a known one is like comparing the blind with the deaf. A scale has to be compared with evident clinical parameters such as global therapeutic efficacy.
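The abstract names no particular reliability statistic. As one common illustration, the sketch below computes Cohen's kappa, the chance-corrected agreement between two sets of categorical ratings. Applied to the same rater on two occasions it gives an intrarater figure; applied to two raters, an interrater figure. The function name and the example ratings are hypothetical.

```python
from collections import Counter


def cohens_kappa(ratings_a, ratings_b):
    """Chance-corrected agreement between two equally long rating lists."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("rating lists must be non-empty and of equal length")
    n = len(ratings_a)
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    freq_a = Counter(ratings_a)
    freq_b = Counter(ratings_b)
    # Agreement expected by chance from the marginal category frequencies.
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / n ** 2
    if expected == 1.0:
        return 1.0  # both lists use one and the same category throughout
    return (observed - expected) / (1 - expected)


# Hypothetical data: one rater scoring the same four patients twice.
first_session = ["improved", "improved", "unchanged", "unchanged"]
second_session = ["improved", "unchanged", "unchanged", "unchanged"]
kappa = cohens_kappa(first_session, second_session)
print(kappa)  # 0.5
```

Plain kappa treats all disagreements alike; for graded scale scores a weighted kappa or an intraclass correlation would usually be the better fit.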

      1.5 Execution and documentation: In this phase quality assurance maintains or enhances the reproducibility and validity of study findings. Quality consciousness and a quality assurance system should cover all aspects of data generation and analysis, since missing data and sloppy data processing and analysis procedures can be at least as deleterious to the end results as carelessness in generating the data. This covers not only the basic data (dosage and efficacy) but also untoward effects, vital signs and laboratory tests.
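One small part of such a quality assurance system is a routine completeness check on each patient record. The sketch below is only an illustration: the field names are hypothetical stand-ins for the data categories listed above, and a real trial would take them from its own patient forms.

```python
# Hypothetical field names standing in for the data categories in the text.
REQUIRED_FIELDS = ("dosage", "efficacy", "untoward_effects", "vital_signs", "laboratory")


def missing_fields(record):
    """Return the required fields that are absent or empty in one record."""
    return [field for field in REQUIRED_FIELDS if record.get(field) is None]


def audit(records):
    """Map patient id to its missing fields, listing incomplete records only."""
    report = {}
    for record in records:
        gaps = missing_fields(record)
        if gaps:
            report[record["patient_id"]] = gaps
    return report


# Hypothetical records: the second one lacks vital signs and laboratory data.
records = [
    {"patient_id": "A01", "dosage": 150, "efficacy": 3, "untoward_effects": [],
     "vital_signs": "recorded", "laboratory": "recorded"},
    {"patient_id": "A02", "dosage": 100, "efficacy": 2, "untoward_effects": ["dry mouth"]},
]
report = audit(records)
print(report)  # {'A02': ['vital_signs', 'laboratory']}
```

An empty list of untoward effects counts as documented here, because "none observed" is itself a recorded finding.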

      1.6 Analysis, synthesis and interpretation: The systematic involvement of a biostatistician from the very beginning (planning) to the end (statistical analysis) is a “sine qua non” for any clinical study, and especially for a multicenter trial. Statistical advice helps to keep undesired variables in check and to obtain data that can be analyzed statistically without loss.

      “Obviously it is important to protect the investigation per se from the major hazard of inappropriate criteria, inappropriate samples, and the numerous sources of error and confounding in the treatment of the patient and in the collection of the data”. A multicenter trial should be stratified a priori (separate randomization within each single study), and each study should be analyzed separately. Differences among hospitals and investigators should be identified (including by interaction tests) before all data are pooled. If a quantitative combination (pooling) of the single study results is not possible, there are several methods for combining data qualitatively (for statistical methods, see the sourcebooks).
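The abstract does not specify which test should be used to identify differences among centers. One standard option, shown here as a sketch, is Cochran's Q on the per-center difference in improvement rates: each center's estimate is weighted by the inverse of its variance, and Q is referred to a chi-square distribution with one degree of freedom fewer than the number of centers. The function names and the patient counts are hypothetical.

```python
def risk_difference(improved_new, n_new, improved_std, n_std):
    """Difference in improvement rates (new minus standard) and its variance."""
    p_new = improved_new / n_new
    p_std = improved_std / n_std
    variance = p_new * (1 - p_new) / n_new + p_std * (1 - p_std) / n_std
    return p_new - p_std, variance


def cochran_q(centers):
    """Return (Q, degrees of freedom, pooled difference) across centers.

    Each center is (improved_new, n_new, improved_std, n_std). A center whose
    rates are all 0% or 100% has zero variance and cannot be weighted this way.
    """
    estimates = [risk_difference(*center) for center in centers]
    weights = [1 / variance for _, variance in estimates]
    pooled = sum(w * d for w, (d, _) in zip(weights, estimates)) / sum(weights)
    q = sum(w * (d - pooled) ** 2 for w, (d, _) in zip(weights, estimates))
    return q, len(centers) - 1, pooled


# Hypothetical counts: the third center points the other way.
centers = [(30, 50, 25, 50), (32, 50, 24, 50), (20, 50, 30, 50)]
q, df, pooled = cochran_q(centers)
print(f"Q = {q:.2f} on {df} degrees of freedom, pooled difference = {pooled:.3f}")
```

A Q well above the chi-square critical value for its degrees of freedom suggests that the centers disagree, in which case the single studies are better reported side by side than merged into one pooled estimate.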

      The interpretation (final report) has to present the single studies and the pooled data separately. The main principle here (for comparisons between drugs, populations and studies, and for the pooling) is to describe all relevant details of (a) the initial (predrug) comparability, (b) the comparability over time (treatment period) and (c) the final comparability.

      The final report should reflect the standard procedure of the studies and of the multicenter trial in structure and content, so as to make it easier to understand and to compare with practice. In a well documented and well presented study it should be possible to trace an individual patient's raw data through to its contribution to a probability statement (computer printouts, tables and listings). The aim is a final monograph in which all the data of the studies and the multicenter trial are combined. Such monographs have long-standing value.

    2. Conclusions: Open and double-blind multicenter trials are the only way in psychopharmacological research to achieve the large sample sizes that are needed for statistical reasons. The prerequisite for meaningful statistical and clinical results is to minimize non-treatment variables. There is one way to succeed: uniform, standardized, strict and well controlled conditions and procedures, cooperation with a biostatistician, slavish attention to a myriad of details, and quality consciousness and control from beginning to end. But we should never forget the primary goal of all our science and research: the benefit of the patient.
