{"title":"A Linked Data Mosaic for Policy-Relevant Research on Science and Innovation: Value, Transparency, Rigor, and Community.","authors":"Wan-Ying Chang, Maryah Garner, Jodi Basner, Bruce Weinberg, Jason Owen-Smith","doi":"10.1162/99608f92.1e23fb3f","DOIUrl":"https://doi.org/10.1162/99608f92.1e23fb3f","url":null,"abstract":"<p><p>This article presents a new framework for realizing the value of linked data understood as a strategic asset and increasingly necessary form of infrastructure for policy-making and research in many domains. We outline a framework, the 'data mosaic' approach, which combines socio-organizational and technical aspects. After demonstrating the value of linked data, we highlight key concepts and dangers for community-developed data infrastructures. We concretize the framework in the context of work on science and innovation generally. Next we consider how a new partnership to link federal survey data, university data, and a range of public and proprietary data represents a concrete step toward building and sustaining a valuable data mosaic. We discuss technical issues surrounding linked data but emphasize that linking data involves addressing the varied concerns of wide-ranging data holders, including privacy, confidentiality, and security, as well as ensuring that all parties receive value from participating. The core of successful data mosaic projects, we contend, is as much institutional and organizational as it is technical. 
As such, sustained efforts to fully engage and develop diverse, innovative communities are essential.</p>","PeriodicalId":73195,"journal":{"name":"Harvard data science review","volume":"4 2","pages":""},"PeriodicalIF":0.0,"publicationDate":"2022-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9616097/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"9359947","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Quantitative Synthesis of Personalized Trials Studies: Meta-Analysis of Aggregated Data Versus Individual Patient Data.","authors":"Mariola Moeyaert, Joelle Fingerhut","doi":"10.1162/99608f92.3574f1dc","DOIUrl":"10.1162/99608f92.3574f1dc","url":null,"abstract":"<p><p>We have entered an era in which scientific knowledge and evidence increasingly inform research practice and policy. As there is an exponential increase in the use of personalized trials, there is a remarkable growing interest in the quantitative synthesis of personalized trials. One technique that is developed and can be applied for this purpose is meta-analysis. Meta-analysis involves the quantitative integration of effect sizes from several personalized trials. In this study, aggregated data (AD) and individual patient data (IPD) methods for meta-analysis of personalized trials are discussed, together with an empirical demonstration using a subset of a real meta-analytic data set. For the empirical demonstration, 26 personalized trials received usual care and yoga intervention in a randomized sequence. Results show a general consensus between the AD and IPD approach in terms of conclusions: both usual care and the yoga intervention are effective in reducing pain. However, the IPD approach provides more information about the intervention effectiveness and intervention heterogeneity. 
IPD is a more flexible modeling approach, allowing for a variety of modeling options.</p>","PeriodicalId":73195,"journal":{"name":"Harvard data science review","volume":" ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2022-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10673630/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"46821882","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"N-of-1 Trials, Their Reporting Guidelines, and the Advancement of Open Science Principles.","authors":"Antony Porcino, Sunita Vohra","doi":"10.1162/99608f92.a65a257a","DOIUrl":"10.1162/99608f92.a65a257a","url":null,"abstract":"<p><p>N-of-1 trials are multiple crossover trials done over time within a single person; they can also be done with a series of individuals. Their focus on the individual as the unit of analysis maintains statistical power while accommodating greater differences between patients than most standard clinical trials. This makes them particularly useful in rare diseases, while also being applicable across many health conditions and populations. Best practices recommend the use of reporting guidelines to publish research in a standardized and transparent fashion. N-of-1 trials have the SPIRIT extension for N-of-1 protocols (SPENT) and the CONSORT extension for N-of-1 trials (CENT). Open science is a recent movement focused on making scientific knowledge fully available to anyone, increasing collaboration, and sharing of scientific efforts. Open science goals increase research transparency, rigor, and reproducibility, and reduce research waste. Many organizations and articles focus on specific aspects of open science, for example, open access publishing. Throughout the trajectory of research (idea, development, running a trial, analysis, publication, dissemination, knowledge translation/reflection), many open science ideals are addressed by the individual-focused nature of N-of-1 trials, including issues such as patient perspectives in research development, personalization, and publications, enhanced equity from the broader inclusion criteria possible, and easier remote trials options. However, N-of-1 trials also help us understand areas of caution, such as monitoring of post hoc analyses and the nuances of confidentiality for rare diseases in open data sharing. 
The N-of-1 reporting guidelines encourage rigor and transparency of N-of-1 considerations for key aspects of the research trajectory.</p>","PeriodicalId":73195,"journal":{"name":"Harvard data science review","volume":" ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2022-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10686313/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"46777089","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Personalized Data Science and Personalized (N-of-1) Trials: Promising Paradigms for Individualized Health Care.","authors":"Naihua Duan, Daniel Norman, Christopher Schmid, Ida Sim, Richard L Kravitz","doi":"10.1162/99608f92.8439a336","DOIUrl":"10.1162/99608f92.8439a336","url":null,"abstract":"<p><p>The term 'data science' usually refers to the process of extracting value from <i>big data</i> obtained from a large group of individuals. An alternative rendition, which we call <i>personalized data science</i> (Per-DS), aims to collect, analyze, and interpret <i>personal data</i> to inform <i>personal</i> decisions. This article describes the main features of Per-DS, and reviews its current state and future outlook. A Per-DS investigation is of, by, and for an individual, the Per-DS investigator, acting simultaneously as her own <i>investigator</i>, <i>study participant</i>, and <i>beneficiary</i>, and making <i>personalized</i> decisions for study design and implementation. The scope of Per-DS studies may include systematic monitoring of physiological or behavioral patterns, case-crossover studies for symptom triggers, pre-post trials for exposure-outcome relationships, and personalized (N-of-1) trials for effectiveness. Per-DS studies produce <i>personal knowledge</i> generalizable to the individual's future self (thus benefiting herself) rather than knowledge generalizable to an external population (thus benefiting others). This endeavor requires a pivot from <i>data mining</i> or <i>extraction</i> to <i>data gardening</i>, analogous to home gardeners producing food for home consumption: the Per-DS investigator needs to '<i>cultivate the field</i>' by setting goals, specifying study design, identifying necessary data elements, and assembling instruments and tools for data collection. Then, she can implement the study protocol, harvest her personal data, and <i>mine</i> the data to <i>extract</i> personal knowledge. 
To facilitate Per-DS studies, Per-DS investigators need support from community-based, scientific, philanthropic, business, and government entities, to develop and deploy resources such as peer forums, mobile apps, 'virtual field guides,' and scientific and regulatory guidance.</p>","PeriodicalId":73195,"journal":{"name":"Harvard data science review","volume":" ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2022-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10673628/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"46992748","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Personalized Trial Ethics and Institutional Review Board Submissions.","authors":"Joyce P Samuel, Susan H Wootton","doi":"10.1162/99608f92.2ded0fc5","DOIUrl":"10.1162/99608f92.2ded0fc5","url":null,"abstract":"<p><p>The ethical and regulatory oversight of any clinical activity related to human subjects is commonly determined based on its categorization as either clinical practice or research. Prominent bioethicists have criticized the traditional distinctions used to delineate these categories, calling them counterproductive and outmoded, and arguing that learning and clinical practice should be deliberately and appropriately integrated. Personalized trials represent a clinical activity with characteristics that overlap both categories, making ethical and regulatory oversight requirements less straightforward. When the primary intent of the personalized trial is to assist in the conduct of individualized patient care with an emphasis on protecting the clinical decision from the biases inherent in usual clinical practice, how should this activity be regulated? In this article, we will explore the ethical underpinnings of personalized trials and propose various approaches to meeting regulatory requirements. Instead of imposing standard research regulations on the conduct of all personalized trials, we recommend that personalized trialists and IRB panels should consider whether participation in a personalized trial results in any foreseeable incremental increase in risk to the participant compared with usual care. 
This approach may reduce regulatory barriers, which could promote more widespread uptake of personalized trials.</p>","PeriodicalId":73195,"journal":{"name":"Harvard data science review","volume":" ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2022-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10813651/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"46521981","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Personalized Feedback for Personalized Trials: Construction of Summary Reports for Participants in a Series of Personalized Trials for Chronic Lower Back Pain.","authors":"Stefani D'Angelo, Heejoon Ahn, Danielle Miller, Rachel Monane, Mark Butler","doi":"10.1162/99608f92.d5b57784","DOIUrl":"10.1162/99608f92.d5b57784","url":null,"abstract":"<p><p>Personalized (N-of-1) trials offer a patient-centered research approach that can provide important clinical information for patients when selecting which treatment options best manage their chronic health concern. Researchers utilizing this approach should present trial results to patients in a clear and understandable manner in order for personalized research trials to be useful to participants. The current study provides participant feedback examples for personalized trial reports using lay summaries and multiple presentation styles from a series of 60 randomized personalized trials examining the effects of massage and yoga versus usual care on chronic lower back pain (CLBP). Researchers generated summary participant reports that describe individual participant results using multiple presentation modalities of data (e.g., visual, written, and auditory) to offer the most appealing style for various participants. The article discusses contents of the participant report as well as participant satisfaction with the personalized summary report, captured using a satisfaction survey administered after study completion. The results from the satisfaction survey in the current study show that participants were generally satisfied with their personalized summary report. 
Researchers will use feedback from the participants in the current study to refine personalized feedback reports for future studies.</p>","PeriodicalId":73195,"journal":{"name":"Harvard data science review","volume":" ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2022-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10673635/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"49414013","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Accommodating Serial Correlation and Sequential Design Elements in Personalized Studies and Aggregated Personalized Studies.","authors":"Nicholas J Schork","doi":"10.1162/99608f92.f1eef6f4","DOIUrl":"10.1162/99608f92.f1eef6f4","url":null,"abstract":"<p><p>Single subject, or 'N-of-1,' studies are receiving a great deal of attention from both theoretical and applied researchers. This is consistent with the growing acceptance of 'personalized' approaches to health care and the need to prove that personalized interventions tailored to an individual's likely unique physiological profile and other characteristics work as they should. In fact, the preferred way of referring to N-of-1 studies in contemporary settings is as 'personalized studies.' Designing efficient personalized studies and analyzing data from them in ways that ensure statistically valid inferences are not trivial, however. I briefly discuss some of the more complex issues surrounding the design and analysis of personalized studies, such as the use of washout periods, the frequency with which measures associated with the efficacy of an intervention are collected during a study, and the serious effect that serial correlation can have on the analysis and interpretation of personalized study data and results if not accounted for explicitly. I point out that more efficient sequential designs for personalized and aggregated personalized studies can be developed, and I explore the properties of sequential personalized studies in a few settings via simulation studies. 
Finally, I comment on contexts within which personalized studies will likely be pursued in the future.</p>","PeriodicalId":73195,"journal":{"name":"Harvard data science review","volume":"2022 SI3","pages":""},"PeriodicalIF":0.0,"publicationDate":"2022-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10081537/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"9283628","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Series of Virtual Interventions for Chronic Lower Back Pain: A Feasibility Pilot Study for a Series of Personalized (N-of-1) Trials.","authors":"Mark Butler, Stefani D'Angelo, Melissa Kaplan, Zarrin Tashnim, Danielle Miller, Heejoon Ahn, Louise Falzon, Andrew J Dominello, Cirrus Foroughi, Thevaa Chandereng, Ken Cheung, Karina Davidson","doi":"10.1162/99608f92.72cd8432","DOIUrl":"10.1162/99608f92.72cd8432","url":null,"abstract":"<p><p>Chronic lower back pain (CLBP) affects 25% of U.S. adults and is associated with high costs due to physician visits and reduced productivity. Research shows that massage and yoga can be effective nonpharmacological treatments for CLBP, but the feasibility, scalability, individual treatment, and adverse-event heterogeneity of these treatments are unknown. The current study evaluated the feasibility and acceptability of a series of personalized (N-of-1) interventions for virtual delivery of massage and yoga or usual-care treatment for CLBP in 57 participants. We hypothesized that this study would provide valuable information about implementing a virtual, personalized platform for randomized controlled trials of personalized (N-of-1) interventions among individuals with CLBP. The study will do so by determining participants' ratings of usability and satisfaction with the virtual, personalized intervention delivery system and, in the long term, identifying ways to integrate these personalized trials into patient care. Of the 57 participants enrolled, two withdrew from the study and were not eligible to receive the primary outcome assessment. Thirty-seven of the remaining 55 participants (67.3%) completed satisfaction surveys comprising the System Usability Scale (SUS) and items assessing satisfaction with the components of the personalized trial. 
Participants rated the usability of the personalized trial as excellent (average SUS score = 85.8), 95% were satisfied with the personalized trial overall, and 100% stated they would recommend the trial to others. These results suggest that personalized trials of massage and yoga are highly feasible and acceptable to participants with CLBP.</p>","PeriodicalId":73195,"journal":{"name":"Harvard data science review","volume":"4 SI3","pages":""},"PeriodicalIF":0.0,"publicationDate":"2022-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10443938/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"10058491","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Data Quality in Electronic Health Record Research: An Approach for Validation and Quantitative Bias Analysis for Imperfectly Ascertained Health Outcomes Via Diagnostic Codes.","authors":"Neal D Goldstein, Deborah Kahal, Karla Testa, Ed J Gracely, Igor Burstyn","doi":"10.1162/99608f92.cbe67e91","DOIUrl":"https://doi.org/10.1162/99608f92.cbe67e91","url":null,"abstract":"<p><p>It is incumbent upon all researchers who use the electronic health record (EHR), including data scientists, to understand the quality of such data. EHR data may be subject to measurement error or misclassification that have the potential to bias results, unless one applies the available computational techniques specifically created for this problem. In this article, we begin with a discussion of data-quality issues in the EHR focusing on health outcomes. We review the concepts of sensitivity, specificity, positive and negative predictive values, and demonstrate how the imperfect classification of a dichotomous outcome variable can bias an analysis, both in terms of prevalence of the outcome, and relative risk of the outcome under one treatment regime (aka exposure) compared to another. This is then followed by a description of a generalizable approach to probabilistic (quantitative) bias analysis using a combination of regression estimation of the parameters that relate the true and observed data and application of these estimates to adjust the prevalence and relative risk that may have existed if there was no misclassification. We describe bias analysis that accounts for both random and systematic errors and highlight its limitations. We then motivate a case study with the goal of validating the accuracy of a health outcome, chronic infection with hepatitis C virus, derived from a diagnostic code in the EHR. 
Finally, we demonstrate our approaches on the case study and conclude by summarizing the literature on outcome misclassification and quantitative bias analysis.</p>","PeriodicalId":73195,"journal":{"name":"Harvard data science review","volume":" ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2022-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9624477/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"40450766","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Bayesian Models for N-of-1 Trials.","authors":"Christopher Schmid, Jiabei Yang","doi":"10.1162/99608f92.3f1772ce","DOIUrl":"10.1162/99608f92.3f1772ce","url":null,"abstract":"<p><p>We describe Bayesian models for data from N-of-1 trials, reviewing both the basics of Bayesian inference and applications to data from single trials and collections of trials sharing the same research questions and data structures. Bayesian inference is natural for drawing inferences from N-of-1 trials because it can incorporate external and subjective information to supplement trial data as well as give straightforward interpretations of posterior probabilities as an individual's state of knowledge about their own condition after their trial. Bayesian models are also easily augmented to incorporate specific characteristics of N-of-1 data such as trend, carryover, and autocorrelation and offer flexibility of implementation. Combining data from multiple N-of-1 trials using Bayesian multilevel models leads naturally to inferences about population and subgroup parameters such as average treatment effects and treatment effect heterogeneity and to improved inferences about individual parameters. Data from a trial comparing different diets for treating children with inflammatory bowel disease are used to illustrate the models and inferences that may be drawn. 
The analysis shows that certain diets were better on average at reducing pain, but that benefits were restricted to a subset of patients and that withdrawal from the study was a good marker for lack of benefit.</p>","PeriodicalId":73195,"journal":{"name":"Harvard data science review","volume":" ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2022-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10817775/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"42815553","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}