{"title":"Diagnosis of a Single-Nucleotide Variant in Whole-Exome Sequencing Data for Patients With Inherited Diseases: Machine Learning Study Using Artificial Intelligence Variant Prioritization.","authors":"Yu-Shan Huang, Ching Hsu, Yu-Chang Chune, I-Cheng Liao, Hsin Wang, Yi-Lin Lin, Wuh-Liang Hwu, Ni-Chung Lee, Feipei Lai","doi":"10.2196/37701","DOIUrl":"10.2196/37701","url":null,"abstract":"<p><strong>Background: </strong>In recent years, thanks to the rapid development of next-generation sequencing (NGS) technology, an entire human genome can be sequenced in a short period. As a result, NGS technology is now being widely introduced into clinical diagnosis practice, especially for diagnosis of hereditary disorders. Although the exome data of single-nucleotide variant (SNV) can be generated using these approaches, processing the DNA sequence data of a patient requires multiple tools and complex bioinformatics pipelines.</p><p><strong>Objective: </strong>This study aims to assist physicians to automatically interpret the genetic variation information generated by NGS in a short period. To determine the true causal variants of a patient with genetic disease, currently, physicians often need to view numerous features on every variant manually and search for literature in different databases to understand the effect of genetic variation.</p><p><strong>Methods: </strong>We constructed a machine learning model for predicting disease-causing variants in exome data. We collected sequencing data from whole-exome sequencing (WES) and gene panel as training set, and then integrated variant annotations from multiple genetic databases for model training. The model built ranked SNVs and output the most possible disease-causing candidates. For model testing, we collected WES data from 108 patients with rare genetic disorders in National Taiwan University Hospital. 
We applied sequencing data and phenotypic information automatically extracted by a keyword extraction tool from patient's electronic medical records into our machine learning model.</p><p><strong>Results: </strong>We succeeded in locating 92.5% (124/134) of the causative variant in the top 10 ranking list among an average of 741 candidate variants per person after filtering. AI Variant Prioritizer was able to assign the target gene to the top rank for around 61.1% (66/108) of the patients, followed by Variant Prioritizer, which assigned it for 44.4% (48/108) of the patients. The cumulative rank result revealed that our AI Variant Prioritizer has the highest accuracy at ranks 1, 5, 10, and 20. It also shows that AI Variant Prioritizer presents better performance than other tools. After adopting the Human Phenotype Ontology (HPO) terms by looking up the databases, the top 10 ranking list can be increased to 93.5% (101/108).</p><p><strong>Conclusions: </strong>We successfully applied sequencing data from WES and free-text phenotypic information of patient's disease automatically extracted by the keyword extraction tool for model training and testing. By interpreting our model, we identified which features of variants are important. Besides, we achieved a satisfactory result on finding the target variant in our testing data set. After adopting the HPO terms by looking up the databases, the top 10 ranking list can be increased to 93.5% (101/108). 
The performance of the model is similar to that","PeriodicalId":73552,"journal":{"name":"JMIR bioinformatics and biotechnology","volume":" ","pages":"e37701"},"PeriodicalIF":0.0,"publicationDate":"2022-09-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11168239/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"45401615","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Reducing Crowding in Emergency Departments With Early Prediction of Hospital Admission of Adult Patients Using Biomarkers Collected at Triage: Retrospective Cohort Study.","authors":"Ann Corneille Monahan, Sue S Feldman, Tony P Fitzgerald","doi":"10.2196/38845","DOIUrl":"10.2196/38845","url":null,"abstract":"<p><strong>Background: </strong>Emergency department crowding continues to threaten patient safety and cause poor patient outcomes. Prior models designed to predict hospital admission have had biases. Predictive models that successfully estimate the probability of patient hospital admission would be useful in reducing or preventing emergency department \"boarding\" and hospital \"exit block\" and would reduce emergency department crowding by initiating earlier hospital admission and avoiding protracted bed procurement processes.</p><p><strong>Objective: </strong>To develop a model to predict imminent adult patient hospital admission from the emergency department early in the patient visit by utilizing existing clinical descriptors (ie, patient biomarkers) that are routinely collected at triage and captured in the hospital's electronic medical records. Biomarkers are advantageous for modeling due to their early and routine collection at triage; instantaneous availability; standardized definition, measurement, and interpretation; and their freedom from the confines of patient histories (ie, they are not affected by inaccurate patient reports on medical history, unavailable reports, or delayed report retrieval).</p><p><strong>Methods: </strong>This retrospective cohort study evaluated 1 year of consecutive data events among adult patients admitted to the emergency department and developed an algorithm that predicted which patients would require imminent hospital admission. Eight predictor variables were evaluated for their roles in the outcome of the patient emergency department visit. 
Logistic regression was used to model the study data.</p><p><strong>Results: </strong>The 8-predictor model included the following biomarkers: age, systolic blood pressure, diastolic blood pressure, heart rate, respiration rate, temperature, gender, and acuity level. The model used these biomarkers to identify emergency department patients who required hospital admission. Our model performed well, with good agreement between observed and predicted admissions, indicating a well-fitting and well-calibrated model that showed good ability to discriminate between patients who would and would not be admitted.</p><p><strong>Conclusions: </strong>This prediction model based on primary data identified emergency department patients with an increased risk of hospital admission. This actionable information can be used to improve patient care and hospital operations, especially by reducing emergency department crowding by looking ahead to predict which patients are likely to be admitted following triage, thereby providing needed information to initiate the complex admission and bed assignment processes much earlier in the care continuum.</p>","PeriodicalId":73552,"journal":{"name":"JMIR bioinformatics and biotechnology","volume":" ","pages":"e38845"},"PeriodicalIF":0.0,"publicationDate":"2022-09-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11135233/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"48343850","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Seasonality of Hashimoto Thyroiditis: Infodemiology Study of Google Trends Data.","authors":"Robert Marcec, Josip Stjepanovic, Robert Likic","doi":"10.2196/38976","DOIUrl":"10.2196/38976","url":null,"abstract":"<p><strong>Background: </strong>Hashimoto thyroiditis (HT) is an autoimmune thyroid disease and the leading cause of hypothyroidism in areas with sufficient iodine intake. The quality-of-life impact and financial burden of hypothyroidism and HT highlight the need for additional research investigating the disease etiology with the aim of revealing potential modifiable risk factors.</p><p><strong>Objective: </strong>Implementation of measures against such risk factors, once identified, has the potential to lessen the financial burden while also improving the quality of life of many individuals. Therefore, we aimed to examine the potential seasonality of HT in Europe using the Google Trends data to explore whether there is a seasonal characteristic of Google searches regarding HT, examine the potential impact of the countries' geographic location on the potential seasonality, and identify potential modifiable risk factors for HT, thereby inspiring future research on the topic.</p><p><strong>Methods: </strong>Monthly Google Trends data on the search topic \"Hashimoto thyroiditis\" were retrieved in a 17-year time frame from January 2004 to December 2020 for 36 European countries. A cosinor model analysis was conducted to evaluate potential seasonality. Simple linear regression was used to estimate the potential effect of latitude and longitude on seasonal amplitude and phase of the model outputs.</p><p><strong>Results: </strong>Of 36 included European countries, significant seasonality was observed in 30 (83%) countries. Most phase peaks occurred in spring (14/30, 46.7%) and winter (8/30, 26.7%). 
A statistically significant effect was observed regarding the effect of geographical latitude on cosinor model amplitude (y = -3.23 + 0.13 x; R<sup>2</sup>=0.29; P=.002). Seasonal increases in HT search volume may therefore be a consequence of an increased incidence or higher disease activity. It is particularly interesting that in most countries, a seasonal peak occurred in spring and winter months; when viewed in the context of the statistically significant impact of geographical latitude on seasonality amplitude, this may indicate the potential role of vitamin D levels in the seasonality of HT.</p><p><strong>Conclusions: </strong>Significant seasonality of HT Google Trends search volume was observed in our study, with seasonal peaks in most countries occurring in spring and winter and with a significant impact of latitude on seasonality amplitude. Further studies on the topic of seasonality in HT and factors impacting it are required.</p>","PeriodicalId":73552,"journal":{"name":"JMIR bioinformatics and biotechnology","volume":" ","pages":"e38976"},"PeriodicalIF":0.0,"publicationDate":"2022-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11135219/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"49510453","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"The Application of Machine Learning in Predicting Mortality Risk in Patients With Severe Femoral Neck Fractures: Prediction Model Development Study.","authors":"Lingxiao Xu, Jun Liu, Chunxia Han, Zisheng Ai","doi":"10.2196/38226","DOIUrl":"10.2196/38226","url":null,"abstract":"<p><strong>Background: </strong>Femoral neck fracture (FNF) accounts for approximately 3.58% of all fractures in the entire body, exhibiting an increasing trend each year. According to a survey, in 1990, the total number of hip fractures in men and women worldwide was approximately 338,000 and 917,000, respectively. In China, FNFs account for 48.22% of hip fractures. Currently, many studies have been conducted on postdischarge mortality and mortality risk in patients with FNF. However, there have been no definitive studies on in-hospital mortality or its influencing factors in patients with severe FNF admitted to the intensive care unit.</p><p><strong>Objective: </strong>In this paper, 3 machine learning methods were used to construct a nosocomial death prediction model for patients admitted to intensive care units to assist clinicians in early clinical decision-making.</p><p><strong>Methods: </strong>A retrospective analysis was conducted using information of a patient with FNF from the Medical Information Mart for Intensive Care III. After balancing the data set using the Synthetic Minority Oversampling Technique algorithm, patients were randomly separated into a 70% training set and a 30% testing set for the development and validation, respectively, of the prediction model. Random forest, extreme gradient boosting, and backpropagation neural network prediction models were constructed with nosocomial death as the outcome. Model performance was assessed using the area under the receiver operating characteristic curve, accuracy, precision, sensitivity, and specificity. 
The predictive value of the models was verified in comparison to the traditional logistic model.</p><p><strong>Results: </strong>A total of 366 patients with FNFs were selected, including 48 cases (13.1%) of in-hospital death. Data from 636 patients were obtained by balancing the data set with the in-hospital death group to survival group as 1:1. The 3 machine learning models exhibited high predictive accuracy, and the area under the receiver operating characteristic curve of the random forest, extreme gradient boosting, and backpropagation neural network were 0.98, 0.97, and 0.95, respectively, all with higher predictive performance than the traditional logistic regression model. Ranking the importance of the feature variables, the top 10 feature variables that were meaningful for predicting the risk of in-hospital death of patients were the Simplified Acute Physiology Score II, lactate, creatinine, gender, vitamin D, calcium, creatine kinase, creatine kinase isoenzyme, white blood cell, and age.</p><p><strong>Conclusions: </strong>Death risk assessment models constructed using machine learning have positive significance for predicting the in-hospital mortality of patients with severe disease and provide a valid basis for reducing in-hospital mortality and improving patient prognosis.</p>","PeriodicalId":73552,"journal":{"name":"JMIR bioinformatics and biotechnology","volume":"1 1","pages":"e38226"},"PeriodicalIF":0.0,"publicationDate":"2022-08-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11135225/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"42491600","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Monitoring Physical Behavior in Rehabilitation Using a Machine Learning-Based Algorithm for Thigh-Mounted Accelerometers: Development and Validation Study.","authors":"Frederik Skovbjerg, Helene Honoré, Inger Mechlenburg, Matthijs Lipperts, Rikke Gade, Erhard Trillingsgaard Næss-Schmidt","doi":"10.2196/38512","DOIUrl":"10.2196/38512","url":null,"abstract":"<p><strong>Background: </strong>Physical activity is emerging as an outcome measure. Accelerometers have become an important tool in monitoring physical behavior, and newer analytical approaches of recognition methods increase the degree of details. Many studies have achieved high performance in the classification of physical behaviors through the use of multiple wearable sensors; however, multiple wearables can be impractical and lower compliance.</p><p><strong>Objective: </strong>The aim of this study was to develop and validate an algorithm for classifying several daily physical behaviors using a single thigh-mounted accelerometer and a supervised machine-learning scheme.</p><p><strong>Methods: </strong>We collected training data by adding the behavior classes-running, cycling, stair climbing, wheelchair ambulation, and vehicle driving-to an existing algorithm with the classes of sitting, lying, standing, walking, and transitioning. After combining the training data, we used a random forest learning scheme for model development. We validated the algorithm through a simulated free-living procedure using chest-mounted cameras for establishing the ground truth. Furthermore, we adjusted our algorithm and compared the performance with an existing algorithm based on vector thresholds.</p><p><strong>Results: </strong>We developed an algorithm to classify 11 physical behaviors relevant for rehabilitation. In the simulated free-living validation, the performance of the algorithm decreased to 57% as an average for the 11 classes (F-measure). 
After merging classes into sedentary behavior, standing, walking, running, and cycling, the result revealed high performance in comparison to both the ground truth and the existing algorithm.</p><p><strong>Conclusions: </strong>Using a single thigh-mounted accelerometer, we obtained high classification levels within specific behaviors. The behaviors classified with high levels of performance mostly occur in populations with higher levels of functioning. Further development should aim at describing behaviors within populations with lower levels of functioning.</p>","PeriodicalId":73552,"journal":{"name":"JMIR bioinformatics and biotechnology","volume":" ","pages":"e38512"},"PeriodicalIF":0.0,"publicationDate":"2022-07-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11135216/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"44711973","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Development of a Multiepitope Vaccine Against SARS-CoV-2: Immunoinformatics Study.","authors":"Fatemeh Ghafouri, Reza Ahangari Cohan, Hilda Samimi, Ali Hosseini Rad S M, Mahmood Naderi, Farshid Noorbakhsh, Vahid Haghpanah","doi":"10.2196/36100","DOIUrl":"https://doi.org/10.2196/36100","url":null,"abstract":"Background Since the first appearance of SARS-CoV-2 in China in December 2019, the world witnessed the emergence of the SARS-CoV-2 outbreak. Due to the high transmissibility rate of the virus, there is an urgent need to design and develop vaccines against SARS-CoV-2 to prevent more cases affected by the virus. Objective A computational approach is proposed for vaccine design against the SARS-CoV-2 spike (S) protein, as the key target for neutralizing antibodies, and envelope (E) protein, which contains a conserved sequence feature. Methods We used previously reported epitopes of S protein detected experimentally and further identified a collection of predicted B-cell and major histocompatibility (MHC) class II–restricted T-cell epitopes derived from E proteins with an identical match to SARS-CoV-2 E protein. Results The in silico design of our candidate vaccine against the S and E proteins of SARS-CoV-2 demonstrated a high affinity to MHC class II molecules and effective results in immune response simulations. 
Conclusions Based on the results of this study, the multiepitope vaccine designed against the S and E proteins of SARS-CoV-2 may be considered as a new, safe, and efficient approach to combatting the COVID-19 pandemic.","PeriodicalId":73552,"journal":{"name":"JMIR bioinformatics and biotechnology","volume":" ","pages":"e36100"},"PeriodicalIF":0.0,"publicationDate":"2022-07-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9302570/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"40657989","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Digital Phenotyping in Health Using Machine Learning Approaches: Scoping Review.","authors":"Schenelle Dayna Dlima, Santosh Shevade, Sonia Rebecca Menezes, Aakash Ganju","doi":"10.2196/39618","DOIUrl":"10.2196/39618","url":null,"abstract":"<p><strong>Background: </strong>Digital phenotyping is the real-time collection of individual-level active and passive data from users in naturalistic and free-living settings via personal digital devices, such as mobile phones and wearable devices. Given the novelty of research in this field, there is heterogeneity in the clinical use cases, types of data collected, modes of data collection, data analysis methods, and outcomes measured.</p><p><strong>Objective: </strong>The primary aim of this scoping review was to map the published research on digital phenotyping and to outline study characteristics, data collection and analysis methods, machine learning approaches, and future implications.</p><p><strong>Methods: </strong>We utilized an a priori approach for the literature search and data extraction and charting process, guided by the PRISMA-ScR (Preferred Reporting Items for Systematic Reviews and Meta-analyses Extension for Scoping Reviews). We identified relevant studies published in 2020, 2021, and 2022 on PubMed and Google Scholar using search terms related to digital phenotyping. The titles, abstracts, and keywords were screened during the first stage of the screening process, and the second stage involved screening the full texts of the shortlisted articles. 
We extracted and charted the descriptive characteristics of the final studies, which were countries of origin, study design, clinical areas, active and/or passive data collected, modes of data collection, data analysis approaches, and limitations.</p><p><strong>Results: </strong>A total of 454 articles on PubMed and Google Scholar were identified through search terms associated with digital phenotyping, and 46 articles were deemed eligible for inclusion in this scoping review. Most studies evaluated wearable data and originated from North America. The most dominant study design was observational, followed by randomized trials, and most studies focused on psychiatric disorders, mental health disorders, and neurological diseases. A total of 7 studies used machine learning approaches for data analysis, with random forest, logistic regression, and support vector machines being the most common.</p><p><strong>Conclusions: </strong>Our review provides foundational as well as application-oriented approaches toward digital phenotyping in health. 
Future work should focus on more prospective, longitudinal studies that include larger data sets from diverse populations, address privacy and ethical concerns around data collection from consumer technologies, and build \"digital phenotypes\" to personalize digital health interventions and treatment plans.</p>","PeriodicalId":73552,"journal":{"name":"JMIR bioinformatics and biotechnology","volume":" ","pages":"e39618"},"PeriodicalIF":0.0,"publicationDate":"2022-07-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11135220/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"48140965","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"An Analysis of Different Distance-Linkage Methods for Clustering Gene Expression Data and Observing Pleiotropy: Empirical Study.","authors":"Joydhriti Choudhury, Faisal Bin Ashraf","doi":"10.2196/30890","DOIUrl":"10.2196/30890","url":null,"abstract":"<p><strong>Background: </strong>Large amounts of biological data have been generated over the last few decades, encouraging scientists to look for connections between genes that cause various diseases. Clustering illustrates such a relationship between numerous species and genes. Finding an appropriate distance-linkage metric to construct clusters from diverse biological data sets has thus become critical. Pleiotropy is also important for a gene's expression to vary and create varied consequences in living things. Finding the pleiotropy of genes responsible for various diseases has become a major research challenge.</p><p><strong>Objective: </strong>Our goal was to establish the optimal distance-linkage strategy for creating reliable clusters from diverse data sets and identifying the common genes that cause various tumors to observe genes with pleiotropic effect.</p><p><strong>Methods: </strong>We considered 4 linking methods-single, complete, average, and ward-and 3 distance metrics-Euclidean, maximum, and Manhattan distance. For assessing the quality of different sets of clusters, we used a fitness function that combines silhouette width and within-cluster distance.</p><p><strong>Results: </strong>According to our findings, the maximum distance measure produces the highest-quality clusters. Moreover, for medium data set, the average linkage method, and for large data set, the ward linkage method works best. The outcome is not improved by using ensemble clustering. 
We also discovered genes that cause 3 different cancers and used gene enrichment to confirm our findings.</p><p><strong>Conclusions: </strong>Accuracy is crucial in clustering, and we investigated the accuracy of numerous clustering techniques in our research. Other studies may aid related works if the data set is similar to ours.</p>","PeriodicalId":73552,"journal":{"name":"JMIR bioinformatics and biotechnology","volume":" ","pages":"e30890"},"PeriodicalIF":0.0,"publicationDate":"2022-06-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11135218/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"49517943","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Differential Expression of Long Noncoding RNAs in Murine Myoblasts After Short Hairpin RNA-Mediated Dysferlin Silencing In Vitro: Microarray Profiling.","authors":"Richa Singhal, Rachel Lukose, Gwenyth Carr, Afsoon Moktar, Ana Lucia Gonzales-Urday, Eric C Rouchka, Bathri N Vajravelu","doi":"10.2196/33186","DOIUrl":"10.2196/33186","url":null,"abstract":"<p><strong>Background: </strong>Long noncoding RNAs (lncRNAs) are noncoding RNA transcripts greater than 200 nucleotides in length and are known to play a role in regulating the transcription of genes involved in vital cellular functions. We hypothesized the disease process in dysferlinopathy is linked to an aberrant expression of lncRNAs and messenger RNAs (mRNAs).</p><p><strong>Objective: </strong>In this study, we compared the lncRNA and mRNA expression profiles between wild-type and dysferlin-deficient murine myoblasts (C2C12 cells).</p><p><strong>Methods: </strong>LncRNA and mRNA expression profiling were performed using a microarray. Several lncRNAs with differential expression were validated using quantitative real-time polymerase chain reaction. Gene Ontology (GO) analysis was performed to understand the functional role of the differentially expressed mRNAs. Further bioinformatics analysis was used to explore the potential function, lncRNA-mRNA correlation, and potential targets of the differentially expressed lncRNAs.</p><p><strong>Results: </strong>We found 3195 lncRNAs and 1966 mRNAs that were differentially expressed. The chromosomal distribution of the differentially expressed lncRNAs and mRNAs was unequal, with chromosome 2 having the highest number of lncRNAs and chromosome 7 having the highest number of mRNAs that were differentially expressed. Pathway analysis of the differentially expressed genes indicated the involvement of several signaling pathways including PI3K-Akt, Hippo, and pathways regulating the pluripotency of stem cells. 
The differentially expressed genes were also enriched for the GO terms, developmental process and muscle system process. Network analysis identified 8 statistically significant (P<.05) network objects from the upregulated lncRNAs and 3 statistically significant network objects from the downregulated lncRNAs.</p><p><strong>Conclusions: </strong>Our results thus far imply that dysferlinopathy is associated with an aberrant expression of multiple lncRNAs, many of which may have a specific function in the disease process. GO terms and network analysis suggest a muscle-specific role for these lncRNAs. To elucidate the specific roles of these abnormally expressed noncoding RNAs, further studies engineering their expression are required.</p>","PeriodicalId":73552,"journal":{"name":"JMIR bioinformatics and biotechnology","volume":"1 1","pages":"e33186"},"PeriodicalIF":0.0,"publicationDate":"2022-06-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11135227/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"42593802","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Exploring the Applicability of Using Natural Language Processing to Support Nationwide Venous Thromboembolism Surveillance: Model Evaluation Study.","authors":"Aaron Wendelboe, Ibrahim Saber, Justin Dvorak, Alys Adamski, Natalie Feland, Nimia Reyes, Karon Abe, Thomas Ortel, Gary Raskob","doi":"10.2196/36877","DOIUrl":"10.2196/36877","url":null,"abstract":"<p><strong>Background: </strong>Venous thromboembolism (VTE) is a preventable, common vascular disease that has been estimated to affect up to 900,000 people per year. It has been associated with risk factors such as recent surgery, cancer, and hospitalization. VTE surveillance for patient management and safety can be improved via natural language processing (NLP). NLP tools have the ability to access electronic medical records, identify patients that meet the VTE case definition, and subsequently enter the relevant information into a database for hospital review.</p><p><strong>Objective: </strong>We aimed to evaluate the performance of a VTE identification model of IDEAL-X (Information and Data Extraction Using Adaptive Learning; Emory University)-an NLP tool-in automatically classifying cases of VTE by \"reading\" unstructured text from diagnostic imaging records collected from 2012 to 2014.</p><p><strong>Methods: </strong>After accessing imaging records from pilot surveillance systems for VTE from Duke University and the University of Oklahoma Health Sciences Center (OUHSC), we used a VTE identification model of IDEAL-X to classify cases of VTE that had previously been manually classified. Experts reviewed the technicians' comments in each record to determine if a VTE event occurred. The performance measures calculated (with 95% CIs) were accuracy, sensitivity, specificity, and positive and negative predictive values. 
Chi-square tests of homogeneity were conducted to evaluate differences in performance measures by site, using a significance level of .05.</p><p><strong>Results: </strong>The VTE model of IDEAL-X \"read\" 1591 records from Duke University and 1487 records from the OUHSC, for a total of 3078 records. The combined performance measures were 93.7% accuracy (95% CI 93.7%-93.8%), 96.3% sensitivity (95% CI 96.2%-96.4%), 92% specificity (95% CI 91.9%-92%), an 89.1% positive predictive value (95% CI 89%-89.2%), and a 97.3% negative predictive value (95% CI 97.3%-97.4%). The sensitivity was higher at Duke University (97.9%, 95% CI 97.8%-98%) than at the OUHSC (93.3%, 95% CI 93.1%-93.4%; <i>P</i><.001), but the specificity was higher at the OUHSC (95.9%, 95% CI 95.8%-96%) than at Duke University (86.5%, 95% CI 86.4%-86.7%; <i>P</i><.001).</p><p><strong>Conclusions: </strong>The VTE model of IDEAL-X accurately classified cases of VTE from the pilot surveillance systems of two separate health systems in Durham, North Carolina, and Oklahoma City, Oklahoma. NLP is a promising tool for the design and implementation of an automated, cost-effective national surveillance system for VTE. Conducting public health surveillance at a national scale is important for measuring disease burden and the impact of prevention measures. 
We recommend additional studies to identify how integrating IDEAL-X in a medical record system could further automate the surveillance process.</p>","PeriodicalId":73552,"journal":{"name":"JMIR bioinformatics and biotechnology","volume":"3 1","pages":"e36877"},"PeriodicalIF":0.0,"publicationDate":"2022-05-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10193259/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"9501826","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}