BMC Medical Research Methodology: latest articles

Artificial intelligence methods applied to longitudinal data from electronic health records for prediction of cancer: a scoping review.
IF 3.9, Zone 3, Medicine
BMC Medical Research Methodology Pub Date: 2025-01-28 DOI: 10.1186/s12874-025-02473-w
Victoria Moglia, Owen Johnson, Gordon Cook, Marc de Kamps, Lesley Smith
Abstract
Background: Early detection and diagnosis of cancer are vital to improving outcomes for patients. Artificial intelligence (AI) models have shown promise in the early detection and diagnosis of cancer, but there is limited evidence on methods that fully exploit the longitudinal data stored within electronic health records (EHRs). This review summarises methods currently used for prediction of cancer from longitudinal data and provides recommendations on how such models should be developed.
Methods: The review was conducted following PRISMA-ScR guidance. Six databases (MEDLINE, EMBASE, Web of Science, IEEE Xplore, PubMed and SCOPUS) were searched for relevant records published before 2 February 2024. Search terms related to the concepts "artificial intelligence", "prediction", "health records", "longitudinal", and "cancer". Data were extracted relating to several areas of the articles: (1) publication details, (2) study characteristics, (3) input data, (4) model characteristics, (5) reproducibility, and (6) quality assessment using the PROBAST tool. Models were evaluated against a framework for terminology relating to reporting of cancer detection and risk prediction models.
Results: Of 653 records screened, 33 were included in the review; 10 predicted risk of cancer, 18 performed either cancer detection or early detection, 4 predicted recurrence, and 1 predicted metastasis. The most common cancers predicted in the studies were colorectal (n = 9) and pancreatic cancer (n = 9). Sixteen studies used feature engineering to represent temporal data, with the most common features representing trends. Eighteen used deep learning models that take a direct sequential input, most commonly recurrent neural networks, but also including convolutional neural networks and transformers. Prediction windows and lead times varied greatly between studies, even for models predicting the same cancer. High risk of bias was found in 90% of the studies. This risk was often introduced by inappropriate study design (n = 26) and sample size (n = 26).
Conclusion: This review highlights the breadth of approaches to cancer prediction from longitudinal data. We identify areas where reporting of methods could be improved, particularly regarding where in a patient's trajectory the model is applied. The review shows opportunities for further work, including comparison of these approaches and their applications in other cancers.
Volume 25(1), page 24. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11773903/pdf/
Citations: 0
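The feature-engineering route described in the review above (summarising a patient's repeated measurements as trend features) can be illustrated with a minimal sketch. It assumes the trend is an ordinary least-squares slope over time; the function name `trend_slope` and the haemoglobin readings are hypothetical and are not taken from any of the reviewed studies.

```python
def trend_slope(times, values):
    """Ordinary least-squares slope of values against times.

    Illustrative only: one possible "trend" feature for a series of
    repeated measurements (for example, blood test results over months).
    """
    if len(times) != len(values) or len(times) < 2:
        raise ValueError("need at least two paired observations")
    n = len(times)
    mean_t = sum(times) / n
    mean_v = sum(values) / n
    # Sum of squared time deviations; zero means all times are identical.
    sxx = sum((t - mean_t) ** 2 for t in times)
    if sxx == 0:
        raise ValueError("times must not all be equal")
    sxy = sum((t - mean_t) * (v - mean_v) for t, v in zip(times, values))
    return sxy / sxx


# Hypothetical haemoglobin readings (g/L) at months 0, 6, 12 and 18.
months = [0, 6, 12, 18]
haemoglobin = [140.0, 134.0, 128.0, 122.0]
slope_per_month = trend_slope(months, haemoglobin)
print(f"trend: {slope_per_month:.2f} g/L per month")
```

A single slope discards the shape of the series, which is one reason the sequence models mentioned in the abstract take the raw ordered measurements as input instead.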
Scalable information extraction from free text electronic health records using large language models.
IF 3.9, Zone 3, Medicine
BMC Medical Research Methodology Pub Date: 2025-01-28 DOI: 10.1186/s12874-025-02470-z
Bowen Gu, Vivian Shao, Ziqian Liao, Valentina Carducci, Santiago Romero Brufau, Jie Yang, Rishi J Desai
Abstract
Background: A vast amount of potentially useful information, such as descriptions of patient symptoms and family and social history, is recorded as free-text notes in electronic health records (EHRs) but is difficult to extract reliably at scale, limiting its utility in research. This study aims to assess whether an "out of the box" implementation of open-source large language models (LLMs) without any fine-tuning can accurately extract social determinants of health (SDoH) data from free-text clinical notes.
Methods: We conducted a cross-sectional study using EHR data from the Mass General Brigham (MGB) system, analyzing free-text notes for SDoH information. We selected a random sample of 200 patients and manually labeled nine SDoH aspects. Eight advanced open-source LLMs were evaluated against a baseline pattern-matching model. Two human reviewers provided the manual labels, achieving 93% inter-annotator agreement. LLM performance was assessed using accuracy metrics for overall, mentioned, and non-mentioned SDoH, and macro F1 scores.
Results: LLMs outperformed the baseline pattern-matching approach, particularly for explicitly mentioned SDoH, achieving up to 40% higher Accuracy_mentioned. openchat_3.5 was the best-performing model, surpassing the baseline in overall accuracy across all nine SDoH aspects. The refined pipeline with prompt engineering reduced hallucinations and improved accuracy.
Conclusions: Open-source LLMs are effective and scalable tools for extracting SDoH from unstructured EHRs, surpassing traditional pattern-matching methods. Further refinement and domain-specific training could enhance their utility in clinical research and predictive analytics, improving healthcare outcomes and addressing health disparities.
Volume 25(1), page 23. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11773977/pdf/
Citations: 0
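The evaluation metrics named in the abstract above (accuracy overall, on mentioned and on non-mentioned SDoH, plus macro F1) can be sketched in a few lines. The abstract does not define them precisely, so this assumes "mentioned" means the reference label is anything other than a "not mentioned" class; the label names and the six example notes are hypothetical.

```python
NOT_MENTIONED = "not mentioned"  # assumed name for the "absent" class


def accuracy(y_true, y_pred):
    """Share of predictions that equal the reference label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def subset_accuracy(y_true, y_pred, mentioned):
    """Accuracy restricted to notes where the aspect is (or is not) mentioned."""
    pairs = [(t, p) for t, p in zip(y_true, y_pred)
             if (t != NOT_MENTIONED) == mentioned]
    return sum(t == p for t, p in pairs) / len(pairs)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical manual labels and model outputs for one aspect (smoking status).
truth = ["current", "never", NOT_MENTIONED, NOT_MENTIONED, "current", "never"]
model = ["current", "never", NOT_MENTIONED, "current", "never", "never"]

overall = accuracy(truth, model)
acc_mentioned = subset_accuracy(truth, model, mentioned=True)
acc_not_mentioned = subset_accuracy(truth, model, mentioned=False)
f1 = macro_f1(truth, model)
```

Reporting the non-mentioned subset separately makes hallucinated labels visible: here the model invents "current" for a note that says nothing about smoking.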
Time and cost of linking administrative datasets for outcomes assessment in a follow-up study of participants from two randomised trials.
IF 3.9, Zone 3, Medicine
BMC Medical Research Methodology Pub Date: 2025-01-27 DOI: 10.1186/s12874-025-02458-9
Mohammad Shahbaz, Jane E Harding, Barry Milne, Anthony Walters, Lisa Underwood, Martin von Randow, Lena Jacob, Greg D Gamble
Abstract
Background: For the follow-up of participants in randomised trials, data linkage is thought to be a more cost-efficient method for assessing outcomes. However, researchers often encounter technical and budgetary challenges. Data requests often require a significant amount of information from researchers, and can take several years to process. This study aimed to determine the feasibility, direct costs and total time required to access administrative datasets for assessment of outcomes in a follow-up study of two randomised trials.
Methods: We applied to access administrative datasets from New Zealand government agencies. All actions of study team members, along with their corresponding dates, were recorded prospectively for accessing data from each agency. Team members estimated the average time they spent on each action, and invoices from agencies were recorded. Additionally, we compared the estimated costs and time required for data linkage with those for obtaining self-reported questionnaires and conducting in-person assessments.
Results: Eight agencies were approached to supply data, of which seven gave approval. The time from first enquiry to receiving an initial dataset ranged from 96 to 854 days. For 859 participants, the estimated time required to obtain outcome data from agencies was 1,530 min; to obtain completed self-reported questionnaires was 11,025 min; and to complete in-person assessments was 77,310 min. The estimated total costs were 20,827 NZD for data linkage, 11,735 NZD for self-reported questionnaires, and 116,085 NZD for in-person assessments. Using these data, we estimate that for a cohort of 100 participants, the costs would be similar for data linkage and in-person assessments. For a cohort of 5,000 participants, we estimate that costs would be similar for data linkage and questionnaires, but ten-fold higher for in-person assessments.
Conclusions: Obtaining administrative datasets demands a substantial amount of time and effort. However, data linkage is a feasible method for outcome ascertainment in follow-up studies in New Zealand. For large cohorts, data linkage is likely to be less costly, whereas for small cohorts, in-person assessment has similar costs but is likely to be faster and allows direct assessment of outcomes.
Volume 25(1), page 21. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11771019/pdf/
Citations: 0
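The totals reported in the abstract above can be turned into per-participant averages with simple arithmetic. The sketch below uses only the figures quoted for the 859 participants; the variable names are illustrative. These are averages at that cohort size only, and they should not be used to extrapolate to other cohort sizes, because the abstract's own extrapolations imply a split into fixed and per-participant costs that it does not report.

```python
PARTICIPANTS = 859

# Totals as quoted in the abstract (time in minutes, cost in NZD).
totals = {
    "data linkage": {"minutes": 1530, "cost_nzd": 20827},
    "self-reported questionnaires": {"minutes": 11025, "cost_nzd": 11735},
    "in-person assessments": {"minutes": 77310, "cost_nzd": 116085},
}

# Average time and cost per participant for each method.
per_participant = {
    method: {
        "minutes": figures["minutes"] / PARTICIPANTS,
        "cost_nzd": figures["cost_nzd"] / PARTICIPANTS,
    }
    for method, figures in totals.items()
}

for method, figures in per_participant.items():
    print(f"{method}: {figures['minutes']:.1f} min, "
          f"{figures['cost_nzd']:.2f} NZD per participant")
```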
Penalized landmark supermodels (penLM) for dynamic prediction for time-to-event outcomes in high-dimensional data.
IF 3.9, Zone 3, Medicine
BMC Medical Research Methodology Pub Date: 2025-01-27 DOI: 10.1186/s12874-024-02418-9
Anya H Fries, Eunji Choi, Summer S Han
Abstract
Background: To effectively monitor long-term outcomes among cancer patients, it is critical to accurately assess patients' dynamic prognosis, which often involves utilizing multiple data sources (e.g., tumor registries, treatment histories, and patient-reported outcomes). However, challenges arise in selecting features to predict patient outcomes from high-dimensional data, aligning longitudinal measurements from multiple sources, and evaluating dynamic model performance.
Methods: We provide a framework for dynamic risk prediction using the penalized landmark supermodel (penLM) and develop novel metrics ([Formula: see text] and [Formula: see text]) to evaluate and summarize model performance across different timepoints. Through simulations, we assess the coverage of the proposed metrics' confidence intervals under various scenarios. We applied penLM to predict the updated 5-year risk of lung cancer mortality at diagnosis and for subsequent years by combining data from SEER registries (2007-2018), Medicare claims (2007-2018), Medicare Health Outcome Survey (2006-2018), and U.S. Census (1990-2010).
Results: The simulations confirmed valid coverage (~95%) of the confidence intervals of the proposed summary metrics. Of 4,670 lung cancer patients, 41.5% died from lung cancer. Using penLM, the key features to predict lung cancer mortality included long-term lung cancer treatments, minority races, regions with low education attainment or racial segregation, and various patient-reported outcomes beyond cancer staging and tumor characteristics. When evaluated using the proposed metrics, the penLM model developed using multi-source data ([Formula: see text] of 0.77 [95% confidence interval: 0.74-0.79]) outperformed those developed using single-source data ([Formula: see text] range: 0.50-0.74).
Conclusions: The proposed penLM framework with novel evaluation metrics offers effective dynamic risk prediction when leveraging high-dimensional multi-source longitudinal data.
Volume 25(1), page 22. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11771018/pdf/
Citations: 0
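The landmarking idea behind the penLM abstract above can be sketched as follows. This is a minimal illustration of how a stacked landmark dataset is commonly built: at each landmark time, keep the subjects still at risk and censor follow-up at the end of the prediction window. It is not the authors' code, it omits the covariates and the penalised model fitting, and the names `build_landmark_rows`, `landmarks` and `horizon` are hypothetical.

```python
def build_landmark_rows(subjects, landmarks, horizon):
    """Stack one row per subject per landmark at which they are still at risk.

    subjects:  list of dicts with "id", "time" (follow-up time) and
               "event" (1 = event observed, 0 = censored).
    landmarks: landmark times s at which predictions are updated.
    horizon:   prediction window w; follow-up is censored at s + w.
    """
    rows = []
    for s in landmarks:
        for subject in subjects:
            if subject["time"] <= s:
                continue  # event or censoring happened before the landmark
            window_end = s + horizon
            if subject["time"] <= window_end:
                exit_time, event = subject["time"], subject["event"]
            else:
                # Still under observation at the window end: censor there.
                exit_time, event = window_end, 0
            rows.append({"id": subject["id"], "landmark": s,
                         "time": exit_time, "event": event})
    return rows


# Hypothetical follow-up times in years.
cohort = [
    {"id": 1, "time": 2.0, "event": 1},
    {"id": 2, "time": 8.0, "event": 1},
    {"id": 3, "time": 4.0, "event": 0},
]
stacked = build_landmark_rows(cohort, landmarks=[0, 1, 3], horizon=5)
```

In a full analysis each stacked row would also carry the covariate values known at its landmark time, and a single (here, penalised) model would be fitted to the stacked data with the landmark time as a stratum or covariate.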
SQUARE-IT: a proposed approach to square the identified research problem in the literature with the objectives, the appropriate clinical research question, and the research hypothesis.
IF 3.9, Zone 3, Medicine
BMC Medical Research Methodology Pub Date: 2025-01-27 DOI: 10.1186/s12874-025-02468-7
Martin Alfuth, Jonas Klemp, Annette Schmidt, Lukas Streese, Nikolai Ramadanov, Robert Prill
Abstract
The purpose of this article is to design and introduce the SQUARE-IT approach to help scientists and clinicians in research to align important research problems with the objectives, the appropriate clinical research questions to be answered, and the research hypotheses to be investigated in medical and therapeutic specialties. Research ideas can be generated primarily through simple methods such as brainstorming and mind mapping. However, transforming ideas into a valid research question is not as easy as it may seem, as the mere presence of an idea does not guarantee that the researcher has already uncovered existing knowledge on a particular topic or identified the actual research problem. Therefore, the SQUARE-IT items are developed, described, and critically discussed with reference to the scientific literature. They ask whether the identified research problem is 'Specific', 'Quantifiable', 'Usable', 'Accurate', 'Restricted', 'Eligible', 'Investigable', and 'Timely'. Before formulating the focused clinical question, SQUARE-IT can be used as a preparatory step to enable researchers to organize the relevant information that has been explored to date and to assess whether additional information is needed, thereby identifying current research gaps. In addition, it should facilitate the effectiveness and efficiency of evidence-based practice to ensure high quality patient care. Using SQUARE-IT as a framework, further elaboration of the approach and addition of other aspects are warranted to advance the discussion and improve methods of evidence-based practice in medical and therapeutic specialties for quality improvement of patient care.
Volume 25(1), page 19. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11770966/pdf/
Citations: 0
Application of GUHA data mining method in cohort data to explore paths associated with premature death: a 29-year follow-up study.
IF 3.9, Zone 3, Medicine
BMC Medical Research Methodology Pub Date: 2025-01-27 DOI: 10.1186/s12874-025-02477-6
Lily Nosraty, Esko Turunen, Saila Kyrönlahti, Clas-Håkan Nygård, Prakash Kc, Subas Neupane
Abstract
Background and method: This study set out to identify the factors and combinations of factors associated with an individual's premature death, using data from the Finnish Longitudinal Study on Ageing Municipal Employees (FLAME), which involved 6,257 participants over a 29-year follow-up period. Exact dates of death were obtained from the Finnish population register. Premature death was defined as a death occurring earlier than the age- and sex-specific actuarial life expectancy indicated by life tables for 1981, as the baseline, with a threshold period of nine months. Explanatory variables encompassed sociodemographic characteristics, health and functioning, health behaviors, subjective experiences, working conditions, and work abilities. Data were mined using the General Unary Hypothesis Automaton (GUHA) method, implemented with LISp-Miner software. GUHA involves an active dialogue between the user and the LISp-Miner software; the parameters used are not absolute but depend on the data to be mined and the user's interests.
Results: Over the follow-up period, 2,196 deaths were recorded, of which 70.4% were premature. Seven single factors and 67 sets of criteria (paths) were statistically significantly associated with premature mortality, passing the one-sided Fisher test. Single predicates of premature death included smoking, consuming alcohol a few times a month or once a week, poor self-rated fitness, incompetence to work and poor assured workability in two years' time, and diseases causing work disability. Notably, most of the factors selected as single predicates of premature mortality did not appear in the multi-predicate paths. Factors appearing in the paths were smoking more than 20 cigarettes a day, symptoms that impaired functioning, past smoking, absence of musculoskeletal diseases, poor self-rated health, having pain, male sex, being married, use of medication, more physical strain compared to others, high life satisfaction, and intention to retire due to reduced work ability caused by diseases and demanding work. Sex-specific analysis revealed similar findings.
Conclusion: The findings indicate that associations between single predictors and premature mortality should be interpreted with caution, even when adjusted for a limited number of other factors. This highlights the complexity of premature mortality and the need for comprehensive models considering multiple interacting factors.
Volume 25(1), page 20. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11771032/pdf/
Citations: 0
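The one-sided Fisher test mentioned in the abstract above can be written out directly from the hypergeometric distribution. The sketch below is a generic textbook version, not the LISp-Miner implementation; the function name and the 2x2 counts are hypothetical, and the table layout (rows: factor present or absent; columns: premature death yes or no) is an assumption.

```python
from math import comb


def fisher_one_sided(a, b, c, d):
    """Upper-tail Fisher exact p-value for the 2x2 table [[a, b], [c, d]].

    Returns P(X >= a), where X is the count in the top-left cell under the
    hypergeometric null with all row and column totals held fixed.
    """
    row1 = a + b          # subjects with the factor
    row2 = c + d          # subjects without the factor
    col1 = a + c          # premature deaths
    total = row1 + row2
    denominator = comb(total, col1)
    largest = min(row1, col1)  # largest feasible top-left count
    tail = sum(comb(row1, x) * comb(row2, col1 - x)
               for x in range(a, largest + 1))
    return tail / denominator


# Hypothetical counts: 3 of 4 subjects with the factor died prematurely,
# compared with 1 of 4 subjects without it.
p_value = fisher_one_sided(3, 1, 1, 3)
print(f"one-sided p = {p_value:.4f}")
```

A small p-value here supports a positive association only; a GUHA run evaluates many candidate paths this way, which is one reason the abstract urges caution in interpreting individual associations.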
Construction of the cancer patients' database based on the US National Health and Nutrition Examination Survey (NHANES) datasets for cancer epidemiology research.
IF 3.9, Zone 3, Medicine
BMC Medical Research Methodology Pub Date: 2025-01-24 DOI: 10.1186/s12874-025-02478-5
Jinyoung Moon, Yongseok Mun
Abstract
Background: The US National Health and Nutrition Examination Survey (NHANES) dataset does not include a specific question or laboratory test to confirm a history of cancer diagnosis. However, if straightforward variables for cancer history are introduced, US NHANES could be effectively utilized in future cancer epidemiology studies. To address this gap, the authors developed a cancer patient database from the US NHANES datasets by employing multiple R programming codes.
Methods: To illustrate the practical application of this methodology to a real-world problem, the authors extracted the R codes applied in an academic paper published in another journal on January 30th, 2024 (https://doi.org/10.1016/j.heliyon.2024.e24337). This paper will focus on the construction of the database and analysis using R codes. Entire.
Results: In the first example, the urine concentrations of monocarboxynonyl phthalate, monocarboxyoctyl phthalate, mono-2-ethyl-5-carboxypentyl phthalate, and mono-2-hydroxy-iso-butyl phthalate (all ng/mL) were used as the independent variables, instead of the serum concentrations of perfluorooctanoic acid (PFOA), perfluorooctane sulfonic acid (PFOS), perfluorohexane sulfonic acid (PFHxS), and perfluorononanoic acid (PFNA), respectively. In the second example, the serum concentrations of 2,3,3',4,4'-Pentachlorobiphenyl (PCB105), 2,3,4,4',5-Pentachlorobiphenyl (PCB114), 2,3',4,4',5-Pentachlorobiphenyl (PCB118), and 2,2',3,4,4',5'- and 2,3,3',4,4',6-Hexachlorobiphenyl (PCB138) were used as the independent variables, instead of the serum concentrations of PFOA, PFOS, PFHxS, and PFNA, respectively.
Discussion: This research offers a comprehensive set of R codes aimed at creating a single, user-friendly variable that encapsulates the history of each type of cancer while also considering the age at which the diagnosis was made. The US NHANES provides a wealth of critical data on environmental toxicant exposures. By employing these R codes, researchers can potentially discover numerous new associations between environmental toxicant exposures and cancer diagnoses. Ultimately, these codes could significantly advance the field of cancer epidemiology in relation to environmental toxicant exposure.
Volume 25(1), page 17. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11758729/pdf/
Citations: 0
Framework for types of metainferences in mixed methods research.
IF 3.9, Zone 3, Medicine
BMC Medical Research Methodology Pub Date: 2025-01-24 DOI: 10.1186/s12874-025-02475-8
Ahtisham Younas, Sergi Fàbregues, Sarah Munce, John W Creswell
Abstract
Background: The generation of metainferences is a core and significant feature of mixed methods research (MMR). In recent years, there has been some discussion in the literature about criteria for appraising the quality of metainferences, the processes for generating them, and the critical role that assessing the "fit" of quantitative and qualitative data and results plays in this generative process. However, little is known about the types of insights that emerge from generating metainferences. To address this gap, this paper conceptualizes and presents the types and forms of metainferences that can be generated in MMR studies for guiding future research projects.
Methods: A critical review of literature sources was conducted, including peer-reviewed articles, book chapters, and research reports. We performed a non-systematic literature search in the Scopus, Web of Science, Ovid, and Google Scholar databases using general phrases such as "inferences in research", "metainferences in mixed methods", "inferences in mixed methods research", and "inference types". Additional searches included key methodological journals, such as the Journal of Mixed Methods Research, International Journal of Multiple Research Approaches, Methodological Innovations, and the Sage Research Methods database, to locate books, chapters, and peer-reviewed articles that discussed inferences and metainferences.
Results: We propose two broad types of metainferences and five sub-types. The broad metainferences are global and specific, and the sub-types include relational, predictive, causal, comparative, and elaborative metainferences. Furthermore, we provide examples of each type of metainference from published mixed methods empirical studies.
Conclusions: This paper contributes to the field of mixed methods research by expanding the knowledge about metainferences and offering a practical framework of types of metainferences for mixed methods researchers and educators. The proposed framework offers an approach to identifying and recognizing types of metainferences in mixed methods research and serves as an opportunity for future discussion on the nature, insights, and characteristic features of metainferences within this methodology. By proposing a foundation for metainferences, our framework advances this critical area of mixed methods research.
Volume 25(1), page 18. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11758751/pdf/
Citations: 0
Analysis methods for covariate-constrained cluster randomized trials with time-to-event outcomes.
IF 3.9, Zone 3, Medicine
BMC Medical Research Methodology Pub Date: 2025-01-22 DOI: 10.1186/s12874-025-02465-w
Amy M Crisp, M Elizabeth Halloran, Matt D T Hitchings, Ira M Longini, Natalie E Dean
Abstract
Background: Cluster randomized trials, which often enroll a small number of clusters, can benefit from constrained randomization, in which the final randomization scheme is selected from a set of known, balanced randomizations. Previous literature has addressed the suitability of adjusting the analysis for the covariates that were balanced in the design phase when the outcome is continuous or binary. Here we extended this work to time-to-event outcomes by comparing two model-based tests and a newly derived permutation test. A current cluster randomized trial of vector control for the prevention of mosquito-borne disease in children in Mexico is used as a motivating example.
Methods: We assessed type I error rates and power between simple randomization and constrained randomization using both prognostic and non-prognostic covariates via a simulation study. We compared the performance of a semi-parametric Cox proportional hazards model with robust variance, a mixed effects Cox model, and a permutation test utilizing deviance residuals.
Results: The permutation test generally maintained nominal type I error (with the exception of the unadjusted analysis for constrained randomization) and also provided power comparable to the two Cox model-based tests. The model-based tests had inflated type I error when there were very few clusters per trial arm. All three methods performed well when there were 25 clusters per trial arm, as in the case of the motivating example.
Conclusion: For time-to-event outcomes, covariate-constrained randomization was shown to improve power relative to simple randomization. The permutation test developed here was more robust to inflation of type I error compared to model-based tests. Gaining power by adjusting for covariates in the analysis phase was largely dependent on the number of clusters per trial arm.
Volume 25(1), page 16. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11753003/pdf/
Citations: 0
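The logic of a cluster-level permutation test, as discussed in the abstract above, can be sketched generically. This is not the authors' test: their statistic is built from deviance residuals of a survival model, whereas the sketch below simply assumes one numeric summary per cluster is already available. The function name, the `allocations` parameter and the example values are hypothetical.

```python
from itertools import combinations


def permutation_p_value(cluster_stats, treated, allocations=None):
    """Two-sided exact permutation p-value for a difference in arm means.

    cluster_stats: one summary value per cluster (assumed precomputed).
    treated:       indices of the clusters actually assigned to treatment.
    allocations:   the set of treatment allocations that the design could
                   have produced. Defaults to every way of choosing
                   len(treated) clusters (simple randomization); under
                   constrained randomization, pass only the allocations
                   that satisfied the balance constraint.
    """
    n = len(cluster_stats)
    total = sum(cluster_stats)

    def mean_difference(arm):
        arm_sum = sum(cluster_stats[i] for i in arm)
        return arm_sum / len(arm) - (total - arm_sum) / (n - len(arm))

    if allocations is None:
        allocations = combinations(range(n), len(treated))
    allocations = list(allocations)
    observed = abs(mean_difference(treated))
    # Small tolerance so that ties are counted despite rounding error.
    as_extreme = sum(1 for arm in allocations
                     if abs(mean_difference(arm)) >= observed - 1e-12)
    return as_extreme / len(allocations)


# Hypothetical cluster summaries for six clusters, three per arm.
stats = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]
p_simple = permutation_p_value(stats, treated=(3, 4, 5))

# Hypothetical constrained design with only four admissible allocations.
admissible = [(0, 1, 2), (3, 4, 5), (0, 3, 4), (1, 2, 5)]
p_constrained = permutation_p_value(stats, (3, 4, 5), allocations=admissible)
```

The second call shows why the reference set matters: the same data give a different p-value once only the allocations the design could actually have produced are permuted.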
The proper application of logistic regression model in complex survey data: a systematic review.
IF 3.9, Zone 3, Medicine
BMC Medical Research Methodology Pub Date: 2025-01-22 DOI: 10.1186/s12874-024-02454-5
Devjit Dey, Md Samio Haque, Md Mojahedul Islam, Umme Iffat Aishi, Sajida Sultana Shammy, Md Sabbir Ahmed Mayen, Syed Toukir Ahmed Noor, Md Jamal Uddin
Abstract
Background: Logistic regression is a useful statistical technique commonly used in many fields like healthcare, marketing, or finance to generate insights from binary outcomes (e.g., sick vs. not sick). However, when applying logistic regression to complex survey data, which includes complex sampling designs, specific methodological issues are often overlooked.
Methods: The systematic review extensively searched the PubMed and ScienceDirect databases from January 2015 to December 2021, following the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) 2020 guidelines, focusing primarily on the Demographic and Health Surveys (DHS) and Multiple Indicator Cluster Surveys (MICS). A total of 810 articles met the inclusion criteria and were included in the analysis. When discussing logistic regression, the review considered multiple methodological problems such as model adequacy assessment, handling dependence of observations, utilization of complex survey design, and dealing with missing values and outliers.
Results: Among the selected articles, the DHS database was used the most (96%), with MICS accounting for only 3%, and both DHS and MICS accounting for 1%. Only 19.7% of the studies employed multilevel mixed-effects logistic regression to account for data dependencies. Model validation techniques were not reported in 94.8% of the studies, with limited use of the bootstrap, jackknife, and other resampling methods. Moreover, sample weights, PSUs, and strata variables were used together in 40.4% of the articles, and 41.7% of the studies did not use any of these variables, which could have produced biased results. Goodness-of-fit assessments were not mentioned in 75.3% of the articles, and the Hosmer-Lemeshow and likelihood ratio tests were the most common among those reported. Furthermore, 95.8% of studies did not mention outliers, and only 41.0% of studies corrected for missing information, while only 2.7% applied imputation techniques.
Conclusions: This systematic review highlights important gaps in the use of logistic regression with complex survey data, such as overlooking data dependencies, survey design, and proper validation techniques, along with neglecting outliers, missing data, and goodness-of-fit assessments, all of which point to the need for clearer methodological standards and more thorough reporting to improve the reliability of results. Future research should focus on consistently following these standards to ensure stronger and more dependable findings.
Volume 25(1), page 15. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11752662/pdf/
Citations: 0
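The abstract above warns that ignoring sample weights can bias results. A toy calculation shows the mechanism for the simplest estimate, a prevalence. The sketch assumes two strata sampled at very different rates; the weights, outcomes and function name are hypothetical and do not come from DHS or MICS data.

```python
def prevalence(outcomes, weights=None):
    """Weighted share of positive outcomes; unweighted if weights is None."""
    if weights is None:
        weights = [1.0] * len(outcomes)
    return sum(w * y for w, y in zip(weights, outcomes)) / sum(weights)


# Hypothetical sample: four respondents from an oversampled stratum (each
# representing 10 people) and four from an undersampled stratum (each
# representing 90 people). 1 = has the outcome, 0 = does not.
outcomes = [1, 1, 1, 0, 0, 0, 0, 1]
weights = [10, 10, 10, 10, 90, 90, 90, 90]

unweighted = prevalence(outcomes)
weighted = prevalence(outcomes, weights)
print(f"unweighted: {unweighted:.2f}, weighted: {weighted:.2f}")
```

Weights correct only the point estimate. Standard errors for a logistic model additionally depend on the clustering (PSUs) and stratification, which is why the review counts studies that used all three design variables together.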