{"title":"Tools to identify linear combination of prognostic factors which maximizes area under receiver operator curve.","authors":"Nicolae Todor, Irina Todor, Gavril Săplăcan","doi":"10.1186/2043-9113-4-10","DOIUrl":"https://doi.org/10.1186/2043-9113-4-10","url":null,"abstract":"<p><strong>Background: </strong>The linear combination of variables is an attractive method in many medical analyses targeting a score to classify patients. In the case of ROC curves the most popular problem is to identify the linear combination which maximizes the area under the curve (AUC). This problem is completely solved when normality assumptions are met. Without the assumption of normality, search algorithms are avoided because it is accepted that AUC has to be evaluated n<sup>d</sup> times, where n is the number of distinct observations and d is the number of variables.</p><p><strong>Methods: </strong>For d = 2, using particularities of the AUC formula, we described an algorithm which lowered the number of evaluations of AUC from n<sup>2</sup> to n(n-1) + 1. For d > 2 our proposed solution is an approximate method that evaluates AUC at equidistant points on the unit sphere in R<sup>d</sup>.</p><p><strong>Results: </strong>The algorithms were applied to data from our lab to predict response to treatment from a set of molecular markers in cervical cancer patients. A simulation was added to evaluate the strength of our algorithms.</p><p><strong>Conclusions: </strong>When normality does not hold, the presented algorithms are feasible. 
With many variables, computation time may increase but remains acceptable.</p>","PeriodicalId":73663,"journal":{"name":"Journal of clinical bioinformatics","volume":"4 ","pages":"10"},"PeriodicalIF":0.0,"publicationDate":"2014-07-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1186/2043-9113-4-10","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"32539949","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Interpretation for scales of measurement linking with abstract algebra.","authors":"Jitsuki Sawamura, Shigeru Morishita, Jun Ishigooka","doi":"10.1186/2043-9113-4-9","DOIUrl":"10.1186/2043-9113-4-9","url":null,"abstract":"<p>The Stevens classification of levels of measurement involves four types of scale: \"Nominal\", \"Ordinal\", \"Interval\" and \"Ratio\". This classification has been used widely in medical fields and has played an important role in the composition and interpretation of scales. With this classification, levels of measurement appear organized and validated. However, a group theory-like systematization beckons as an alternative because of its logical consistency and unexceptional applicability in the natural sciences, and it may offer great advantages in clinical medicine. According to this viewpoint, the Stevens classification is reformulated within an abstract algebra-like scheme; 'Abelian modulo additive group' for \"Ordinal scale\" accompanied with 'zero', 'Abelian additive group' for \"Interval scale\", and 'field' for \"Ratio scale\". Furthermore, a vector-like display arranges a mixture of schemes describing the assessment of patient states. With this vector-like notation, data mining and data-set combination are possible on a higher abstract structure level based upon a hierarchical-cluster form. Using simple examples, we show that operations acting on the corresponding mixed schemes of this display allow for a sophisticated means of classifying, updating, monitoring, and prognosis, where better data mining/data usage and efficacy are expected. 
</p>","PeriodicalId":73663,"journal":{"name":"Journal of clinical bioinformatics","volume":"4 ","pages":"9"},"PeriodicalIF":0.0,"publicationDate":"2014-06-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1186/2043-9113-4-9","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"32472854","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"comoR: a software for disease comorbidity risk assessment.","authors":"Mohammad Ali Moni, Pietro Liò","doi":"10.1186/2043-9113-4-8","DOIUrl":"https://doi.org/10.1186/2043-9113-4-8","url":null,"abstract":"<p><strong>Background: </strong>The diagnosis of comorbidities, that is, the coexistence of different acute and chronic diseases, is difficult due to the extreme modern specialisation of physicians. We envisage that software dedicated to comorbidity diagnosis could be an effective aid to health practice.</p><p><strong>Results: </strong>We have developed an R software package, comoR, to compute novel estimators of disease comorbidity associations. Starting from an initial diagnosis and the genetic and clinical data of a patient, the software identifies the risk of disease comorbidity. It then provides a pipeline with different causal inference packages (e.g. pcalg, qtlnet etc) to predict the causal relationships between diseases. It also provides a pipeline with network regression and survival analysis tools (e.g. Net-Cox, rbsurv etc) to predict patient survival probability more accurately. 
The input of this software is the initial diagnosis for a patient and the output provides evidence of disease comorbidity mapping.</p><p><strong>Conclusions: </strong>The functions of comoR offer flexibility for diagnostic applications to predict disease comorbidities, and can be easily integrated into high-throughput and clinical data analysis pipelines.</p>","PeriodicalId":73663,"journal":{"name":"Journal of clinical bioinformatics","volume":"4 ","pages":"8"},"PeriodicalIF":0.0,"publicationDate":"2014-05-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1186/2043-9113-4-8","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"32520603","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Estimating age-dependent per-encounter chlamydia trachomatis acquisition risk via a Markov-based state-transition model.","authors":"Yu Teng, Nan Kong, Wanzhu Tu","doi":"10.1186/2043-9113-4-7","DOIUrl":"10.1186/2043-9113-4-7","url":null,"abstract":"<p><strong>Background: </strong>Chlamydial infection is a common bacterial sexually transmitted infection worldwide, caused by C. trachomatis. The screening for C. trachomatis has been proven to be successful. However, such success is not fully realized through tailoring the recommended screening strategies for different age groups. This is partly due to the knowledge gap in understanding how the infection is correlated with age. In this paper, we estimate age-dependent risks of acquiring C. trachomatis by adolescent women via unprotected heterosexual acts.</p><p><strong>Methods: </strong>We develop a time-varying Markov state-transition model and compute the incidences of chlamydial infection at discrete age points by simulating the state-transition model with candidate per-encounter acquisition risks and sampled numbers of unit-time unprotected coital events at different age points. We solve an optimization problem to identify the age-dependent estimates that offer the closest matches to the observed infection incidences. We also investigate the impact of antimicrobial treatment effectiveness on the parameter estimates and the differences between the acquisition risks for the first-time infections and repeated infections.</p><p><strong>Results: </strong>Our case study supports the beliefs that age is an inverse predictor of C. 
trachomatis transmission and that protective immunity developed after initial infection is only partial.</p><p><strong>Conclusions: </strong>Our modeling method offers a flexible and expandable platform for investigating STI transmission.</p>","PeriodicalId":73663,"journal":{"name":"Journal of clinical bioinformatics","volume":"4 ","pages":"7"},"PeriodicalIF":0.0,"publicationDate":"2014-04-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4022339/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"32378765","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A novel tree-based procedure for deciphering the genomic spectrum of clinical disease entities.","authors":"Cyprien Mbogning, Hervé Perdry, Wilson Toussile, Philippe Broët","doi":"10.1186/2043-9113-4-6","DOIUrl":"10.1186/2043-9113-4-6","url":null,"abstract":"<p><strong>Background: </strong>Dissecting the genomic spectrum of clinical disease entities is a challenging task. Recursive partitioning (or classification trees) methods provide powerful tools for exploring complex interplay among genomic factors, with respect to a main factor, that can reveal hidden genomic patterns. To take confounding variables into account, the partially linear tree-based regression (PLTR) model has been recently published. It combines regression models and tree-based methodology. It is, however, computationally burdensome and not well suited to situations in which a large number of exploratory variables is expected.</p><p><strong>Methods: </strong>We developed a novel procedure that represents an alternative to the original PLTR procedure, and considered different selection criteria. A simulation study with different scenarios has been performed to compare the performance of the proposed procedure with that of the original PLTR strategy.</p><p><strong>Results: </strong>The proposed procedure with a Bayesian Information Criterion (BIC) performed well in detecting the hidden structure compared with the original procedure. The novel procedure was used for analyzing patterns of copy-number alterations in lung adenocarcinomas, with respect to Kirsten Rat Sarcoma Viral Oncogene Homolog gene (KRAS) mutation status, while controlling for a cohort effect. Results highlight two subgroups of pure or nearly pure wild-type KRAS tumors with particular copy-number alteration patterns.</p><p><strong>Conclusions: </strong>The proposed procedure with the BIC represents a powerful and practical alternative to the original procedure. 
Our procedure performs well in a general framework and is simple to implement.</p>","PeriodicalId":73663,"journal":{"name":"Journal of clinical bioinformatics","volume":"4 ","pages":"6"},"PeriodicalIF":0.0,"publicationDate":"2014-04-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4129184/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"32267694","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Combined analysis of chromosomal instabilities and gene expression for colon cancer progression inference.","authors":"Claudia Cava, Italo Zoppis, Manuela Gariboldi, Isabella Castiglioni, Giancarlo Mauri, Marco Antoniotti","doi":"10.1186/2043-9113-4-2","DOIUrl":"https://doi.org/10.1186/2043-9113-4-2","url":null,"abstract":"<p><strong>Background: </strong>Copy number alterations (CNAs) represent an important component of genetic variations. Such alterations are associated with certain types of cancer including those of the pancreas, colon, and breast, among others. CNAs have been used as biomarkers for cancer prognosis in multiple studies, but few studies report on the relation between CNAs and disease progression. Moreover, most studies do not consider the following two important issues. (I) The identification of CNAs in genes which are responsible for expression regulation is fundamental in order to define genetic events leading to malignant transformation and progression. (II) Most real domains are best described by structured data where instances of multiple types are related to each other in complex ways.</p><p><strong>Results: </strong>Our main interest is to check whether the colorectal cancer (CRC) progression inference benefits when considering both (I) the expression levels of genes with CNAs, and (II) relationships (i.e. dissimilarities) between patients due to expression level differences of the altered genes. We first evaluate the accuracy of a state-of-the-art inference method (support vector machine) when subjects are represented only through sets of available attribute values (i.e. gene expression level). Then we check whether the inference accuracy improves when explicitly exploiting the information mentioned above. Our results suggest that the CRC progression inference improves when the combined data (i.e. 
CNA and expression level) and the considered dissimilarity measures are applied.</p><p><strong>Conclusions: </strong>Through our approach, classification is intuitively appealing and can be conveniently obtained in the resulting dissimilarity spaces. Different public datasets from Gene Expression Omnibus (GEO) were used to validate the results.</p>","PeriodicalId":73663,"journal":{"name":"Journal of clinical bioinformatics","volume":"4 1","pages":"2"},"PeriodicalIF":0.0,"publicationDate":"2014-01-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1186/2043-9113-4-2","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"32056019","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Clinical detection of human probiotics and human pathogenic bacteria by using a novel high-throughput platform based on next generation sequencing.","authors":"Chih-Min Chiu, Feng-Mao Lin, Tzu-Hao Chang, Wei-Chih Huang, Chao Liang, Ting Yang, Wei-Yun Wu, Tzu-Ling Yang, Shun-Long Weng, Hsien-Da Huang","doi":"10.1186/2043-9113-4-1","DOIUrl":"https://doi.org/10.1186/2043-9113-4-1","url":null,"abstract":"<p><strong>Background: </strong>The human body plays host to a vast array of bacteria, found in the oral cavity, skin, gastrointestinal tract and vagina. Some bacteria are harmful while others are beneficial to the host. Despite the availability of many methods to identify bacteria, most of them are only applicable to specific and cultivable bacteria and are also tedious. Based on high throughput sequencing technology, this work derives 16S rRNA sequences of bacteria and analyzes probiotic and pathogen species.</p><p><strong>Results: </strong>We constructed a database that recorded the species of probiotics and pathogens from literature, along with a modified Smith-Waterman algorithm for assigning the taxonomy of the sequenced 16S rRNA sequences. We also constructed a bacteria disease risk model for seven diseases based on 98 samples. Applicability of the proposed platform is demonstrated by collecting the human gut microbiome from 13 samples.</p><p><strong>Conclusions: </strong>The proposed platform provides a relatively easy means of identifying a certain number of bacteria and their species (including uncultivable pathogens) for clinical microbiology applications. 
That is, detecting how probiotics and pathogens inhabit humans and how they affect human health can contribute significantly to developing diagnosis and treatment methods.</p>","PeriodicalId":73663,"journal":{"name":"Journal of clinical bioinformatics","volume":"4 1","pages":"1"},"PeriodicalIF":0.0,"publicationDate":"2014-01-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1186/2043-9113-4-1","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"32023903","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"PROGgene: gene expression based survival analysis web application for multiple cancers.","authors":"Chirayu Pankaj Goswami, Harikrishna Nakshatri","doi":"10.1186/2043-9113-3-22","DOIUrl":"https://doi.org/10.1186/2043-9113-3-22","url":null,"abstract":"<p><strong>Background: </strong>Identification of prognostic mRNA biomarkers has been done for various cancer types. The data that are published from such studies are archived in public repositories. There are hundreds of such datasets available for multiple cancer types in public repositories. This wealth of data can be used to study prognostic implications of mRNA in different cancers as well as in different populations or subtypes of the same cancer.</p><p><strong>Description: </strong>We have created a web application that can be used for studying prognostic implications of mRNA biomarkers in a variety of cancers. We have compiled data from public repositories such as GEO, EBI Array Express and The Cancer Genome Atlas for creating this tool. With 64 patient series from 18 cancer types in our database, this tool provides the most comprehensive resource available for survival analysis to date. The tool is called PROGgene and it is available at http://www.compbio.iupui.edu/proggene.</p><p><strong>Conclusions: </strong>We present this tool as a hypothesis generation tool for researchers to identify potential prognostic mRNA biomarkers to follow up with further research. For this reason, we have kept the web application very simple and straightforward. 
We believe this tool will be useful in accelerating biomarker discovery in cancer and quickly providing results that may indicate disease-specific prognostic value of specific biomarkers.</p>","PeriodicalId":73663,"journal":{"name":"Journal of clinical bioinformatics","volume":" ","pages":"22"},"PeriodicalIF":0.0,"publicationDate":"2013-10-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1186/2043-9113-3-22","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"40271574","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A systematic analysis of a mi-RNA inter-pathway regulatory motif.","authors":"Stefano Di Carlo, Gianfranco Politano, Alessandro Savino, Alfredo Benso","doi":"10.1186/2043-9113-3-20","DOIUrl":"https://doi.org/10.1186/2043-9113-3-20","url":null,"abstract":"<p><strong>Background: </strong>The continuing discovery of new types and functions of small non-coding RNAs suggests the presence of regulatory mechanisms far more complex than the ones currently used to study and design Gene Regulatory Networks. Focusing on micro RNAs (miRNAs) alone, they have been found to be part of several intra-pathway regulatory motifs. However, inter-pathway regulatory mechanisms have often been neglected and require further investigation.</p><p><strong>Results: </strong>In this paper we present the result of a systems biology study aimed at analyzing a high-level inter-pathway regulatory motif called Pathway Protection Loop, not previously described, in which miRNAs seem to play a crucial role in the successful behavior and activation of a pathway. Through the automatic analysis of a large set of publicly available databases, we found statistical evidence that this inter-pathway regulatory motif is very common in several classes of KEGG Homo Sapiens pathways and contributes to creating a complex regulatory network involving several pathways connected by this specific motif. The role of this motif also seems to be confirmed by a deeper review of other research activities on selected representative pathways.</p><p><strong>Conclusions: </strong>Although previous studies suggested transcriptional regulation mechanisms at the pathway level such as the Pathway Protection Loop, a high-level analysis like the one proposed in this paper is still missing. The understanding of higher-level regulatory motifs could, for instance, lead to new approaches in the identification of therapeutic targets because it could unveil new and \"indirect\" paths to activate or silence a target pathway. 
However, a lot of work still needs to be done to better uncover this high-level inter-pathway regulation including enlarging the analysis to other small non-coding RNA molecules.</p>","PeriodicalId":73663,"journal":{"name":"Journal of clinical bioinformatics","volume":" ","pages":"20"},"PeriodicalIF":0.0,"publicationDate":"2013-10-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1186/2043-9113-3-20","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"40258735","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Potential identification of pediatric asthma patients within pediatric research database using low rank matrix decomposition.","authors":"Teeradache Viangteeravat","doi":"10.1186/2043-9113-3-16","DOIUrl":"https://doi.org/10.1186/2043-9113-3-16","url":null,"abstract":"<p>Asthma is a prevalent disease in pediatric patients, and most cases begin in the very early years of life. Early identification of patients at high risk of developing the disease can alert us to provide them with the best treatment to manage asthma symptoms. Identifying patients at high risk of developing asthma from huge data sets (e.g., electronic medical records) is often challenging and very time consuming, and a lack of complex data analysis or proper clinical logic determination might produce invalid results and irrelevant treatments. In this article, we used data from the Pediatric Research Database (PRD) to develop an asthma prediction model from past All Patient Refined Diagnosis Related Groupings (APR-DRGs) coding assignments. The knowledge gleaned from this asthma prediction model, from both routine use by physicians and experimental findings, will be fused into a knowledge-based database for dissemination to those involved with asthma patients. Success with this model may lead to expansion to other diseases. </p>","PeriodicalId":73663,"journal":{"name":"Journal of clinical bioinformatics","volume":" ","pages":"16"},"PeriodicalIF":0.0,"publicationDate":"2013-09-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1186/2043-9113-3-16","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"31764931","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}