{"title":"Linear regression-based feature selection for microarray data classification.","authors":"Md Abid Hasan, Md Kamrul Hasan, M Abdul Mottalib","doi":"10.1504/ijdmb.2015.066776","DOIUrl":"https://doi.org/10.1504/ijdmb.2015.066776","url":null,"abstract":"<p><p>Predicting the class of gene expression profiles helps improve the diagnosis and treatment of diseases. Analysing huge gene expression data otherwise known as microarray data is complicated due to its high dimensionality. Hence the traditional classifiers do not perform well where the number of features far exceeds the number of samples. A good set of features help classifiers to classify the dataset efficiently. Moreover, a manageable set of features is also desirable for the biologist for further analysis. In this paper, we have proposed a linear regression-based feature selection method for selecting discriminative features. Our main focus is to classify the dataset more accurately using less number of features than other traditional feature selection methods. Our method has been compared with several other methods and in almost every case the classification accuracy is higher using less number of features than the other popular feature selection methods.</p>","PeriodicalId":54964,"journal":{"name":"International Journal of Data Mining and Bioinformatics","volume":null,"pages":null},"PeriodicalIF":0.3,"publicationDate":"2015-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1504/ijdmb.2015.066776","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"33906547","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Grey anti-inflammation analysis of phenolic acid phenethyl esters in human neutrophils.","authors":"Ya-Ting Lee, Chian-Song Chiu","doi":"10.1504/ijdmb.2015.066769","DOIUrl":"https://doi.org/10.1504/ijdmb.2015.066769","url":null,"abstract":"<p><p>This paper presents grey structure activity relationship analysis for anti-inflammation of phenolic acid phenethyl esters in human neutrophils. To study the anti-inflammation effect, 14 compounds of phenolic acid phenethyl esters are synthesised, while the inhibition on superoxide anion generation (which is linked to an inflammation effect) induced by PMA and fMLP stimulants is detected. Next, the relationship weighting of each functional group of phenolic acid phenethyl esters is found by applying the grey system theory on the measured data. Moreover, evident structure activity relationships are established to regulate the anti-inflammation effect of such compounds, e.g. the most important functional group affecting the anti-inflammation in human neutrophils is revealed. In addition, some extending results are obtained based on the grey analysis. It is interesting that the analysed result is consistent with the actual circumstance. In comparison with traditional methods, this paper applying the grey theory indicates more characteristic information about the structure activity relationships of phenolic acid phenethyl esters while fewer data samples are required.</p>","PeriodicalId":54964,"journal":{"name":"International Journal of Data Mining and Bioinformatics","volume":null,"pages":null},"PeriodicalIF":0.3,"publicationDate":"2015-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1504/ijdmb.2015.066769","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"33906551","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"The relationship between experimentally validated intracellular human protein stability and the features of its solvent accessible surface.","authors":"Xiaofeng Song, Yan Jing, Ping Han","doi":"10.1504/ijdmb.2015.066338","DOIUrl":"https://doi.org/10.1504/ijdmb.2015.066338","url":null,"abstract":"<p><p>Protein degradation is critical for most cellular processes, and investigating the degradation signals in the sequence and structure is beneficial for analysing the protein stability. In this paper, we investigated in depth the intrinsic factors affecting the protein degradation based on the sequence and structure features. The results indicated that there are more hydrophobic residues on the surface of short-lived protein than the long-lived protein. The secondary structure such as coil tends to be on the surface of short-lived protein. There are more serine phosphorylation sites on the short-lived protein surface, and there is higher possibility for the short-lived proteins to start the degradation by signal of PEST motif than long-lived proteins. We also found that almost all of N terminal residues are exposed to be on the surface; therefore, the specific features of the solvent accessible surface residues are the key factors affecting intracellular protein stability.</p>","PeriodicalId":54964,"journal":{"name":"International Journal of Data Mining and Bioinformatics","volume":null,"pages":null},"PeriodicalIF":0.3,"publicationDate":"2015-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1504/ijdmb.2015.066338","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"33973463","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Improving protein-protein interaction article classification using biological domain knowledge.","authors":"Yifei Chen, Hongjian Guo, Feng Liu, Bernard Manderick","doi":"10.1504/ijdmb.2015.069415","DOIUrl":"https://doi.org/10.1504/ijdmb.2015.069415","url":null,"abstract":"<p><p>Interaction Article Classification (IAC) is a specific text classification application in biological domain that tries to find out which articles describe Protein-Protein Interactions (PPIs) to help extract PPIs from biological literature more efficiently. However, the existing text representation and feature weighting schemes commonly used for text classification are not well suited for IAC. We capture and utilise biological domain knowledge, i.e. gene mentions also known as protein or gene names in the articles, to address the problem. We put forward a new gene mention order-based approach that highlights the important role of gene mentions to represent the texts. Furthermore, we also incorporate the information concerning gene mentions into a novel feature weighting scheme called Gene Mention-based Term Frequency (GMTF). By conducting experiments, we show that using the proposed representation and weighting schemes, our Interaction Article Classifier (IACer) performs better than other leading systems for the moment.</p>","PeriodicalId":54964,"journal":{"name":"International Journal of Data Mining and Bioinformatics","volume":null,"pages":null},"PeriodicalIF":0.3,"publicationDate":"2015-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1504/ijdmb.2015.069415","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"34123509","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Employing social network analysis for disease biomarker detection.","authors":"Tansel Ozyer, Serkan Ucer, Taylan Iyidogan","doi":"10.1504/ijdmb.2015.069661","DOIUrl":"https://doi.org/10.1504/ijdmb.2015.069661","url":null,"abstract":"<p><p>Detection of disease biomarkers in general and cancer biomarkers in particular is an important task which has received considerable attention in the area of in silico genomic experiments. We describe a new approach for detecting cancer biomarkers based on genomic microarray data; it is characterised by employing Social Network Analysis (SNA) techniques. Through social interaction perspective, we can have genes as actors in a social network, where similarities between genes can be described as connections between these actors. The correct determination of biomarkers out of huge genomic data dramatically decreases the number of features. It is also possible to achieve the same or better classification performance compared to using the whole data. The minimum number of biomarkers can be researched further biologically to reduce the numerous time-consuming in vitro experiments. Results of the conducted experiments with selected biomarkers are promising and efficient.</p>","PeriodicalId":54964,"journal":{"name":"International Journal of Data Mining and Bioinformatics","volume":null,"pages":null},"PeriodicalIF":0.3,"publicationDate":"2015-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1504/ijdmb.2015.069661","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"34192162","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A novel filter feature selection method for paired microarray expression data analysis.","authors":"Zhongbo Cao, Yan Wang, Ying Sun, Wei Du, Yanchun Liang","doi":"10.1504/ijdmb.2015.070071","DOIUrl":"https://doi.org/10.1504/ijdmb.2015.070071","url":null,"abstract":"<p><p>In recent years, a large amount of microarray data sets are produced with tens of thousands of genes. Feature selection has become a very sharp tool to select the informative genes. However, few feature selection methods consider the effect of paired samples, which are much more considered in the experiments of these years. Here, we propose a new feature selection method for paired microarray data sets analysis. It uses the fold change instead of the subtraction in the original approach, measures the statistical significance using the q-value of False Discovery Rate (FDR) and also decreases the influence of redundant genes. We compare the proposed method with another six existing methods in prediction performance, stability of gene lists, functional stability and functional enrichment analysis using six kinds of paired cancer data sets. Comparison results show that our proposed method achieves better effectiveness, stability and consistency when it is applied to paired data sets.</p>","PeriodicalId":54964,"journal":{"name":"International Journal of Data Mining and Bioinformatics","volume":null,"pages":null},"PeriodicalIF":0.3,"publicationDate":"2015-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1504/ijdmb.2015.070071","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"34192163","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Analyzing Large Biological Datasets with an Improved Algorithm for MIC","authors":"Shuliang Wang, Yiping Zhao","doi":"10.1504/IJDMB.2015.071548","DOIUrl":"https://doi.org/10.1504/IJDMB.2015.071548","url":null,"abstract":"The computational framework used the traditional similarity measures to find out the significant relationships in biological annotations. But its prerequisites that the biological annotations do not cooccur with each other is particular. To overcome it, in this paper a new method Improved Algorithm for Maximal Information Coefficient (IAMIC) is suggested to discover the hidden regularities between biological annotations. IAMIC approximates a novel similarity coefficient on maximal information coefficient with generality and equitability, by bettering axis partition through quadratic optimisation instead of violence search. The experimental results show that IAMIC is more appropriate for identifying the associations between biological annotations, and further extracting the novel associations hidden in collected data sets than other similarity measures.","PeriodicalId":54964,"journal":{"name":"International Journal of Data Mining and Bioinformatics","volume":null,"pages":null},"PeriodicalIF":0.3,"publicationDate":"2014-03-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1504/IJDMB.2015.071548","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"66730538","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Multiscale agent-based modelling of ovarian cancer progression under the stimulation of the STAT 3 pathway.","authors":"Le Zhang, Yao Xue, Beini Jiang, Costas Strouthos, Zhenfeng Duan, Yukun Wu, Jing Su, Xiaobo Zhou","doi":"10.1504/ijdmb.2014.060050","DOIUrl":"https://doi.org/10.1504/ijdmb.2014.060050","url":null,"abstract":"<p><p>This research is developed to simulate ovarian cancer progression with signal transducers and activators of the transcription 3 (STAT 3) pathway. The main focus is on studying how the STAT 3 pathway affects the cancer cells' biomechanical phenotype under the stimulation of the interleukin-6 (IL-6) cytokine and various well-known microscopic factors. The simulated results agreed with recent experimental evidence that ovarian cancer cells with a stimulated STAT 3 pathway have high survival rates and drug resistance. And we discussed how the IL6 and these well-known microscopic factors impacted the cancer progression.</p>","PeriodicalId":54964,"journal":{"name":"International Journal of Data Mining and Bioinformatics","volume":null,"pages":null},"PeriodicalIF":0.3,"publicationDate":"2014-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1504/ijdmb.2014.060050","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"32619279","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Development of 3D-QSAR combination approach for discovering and analysing neuraminidase inhibitors in silico.","authors":"Chun-Yuan Lin, Hsiao-Chieh Chi, Kuei-Chung Shih, Jiayi Zhou, Nai-Wan Hsiao, Chuan-Yi Tang","doi":"10.1504/ijdmb.2014.060053","DOIUrl":"https://doi.org/10.1504/ijdmb.2014.060053","url":null,"abstract":"<p><p>Zanamivir and Oseltamivir are both sialic acid analog inhibitors of Neuraminidase (NA), which is an important target in influenza A virus treatment. Quantitative Structure-Activity Relationships (QSAR) is a common computational method for correlating the structural properties of compounds (or inhibitors) with their biological activities. The pharmacophore model easily and quickly recognises related inhibitors and also fits the binding site interaction features of a protein structure. The Comparative Molecular Similarity Index Analysis (CoMSIA) model easily optimises molecular structures and describes the limit range of molecule weights. This study proposes a combination approach that integrates these two models based on the same training set inhibitors in order to screen and optimize NA inhibitor candidates during drug design.</p>","PeriodicalId":54964,"journal":{"name":"International Journal of Data Mining and Bioinformatics","volume":null,"pages":null},"PeriodicalIF":0.3,"publicationDate":"2014-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1504/ijdmb.2014.060053","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"32619286","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"OligoSpecificitySystem: global matching efficiency calculation of oligonucleotide sets taking into account degeneracy and mismatch possibilities.","authors":"R J Michelland, S Combes, L Cauquil","doi":"10.1504/ijdmb.2014.062148","DOIUrl":"https://doi.org/10.1504/ijdmb.2014.062148","url":null,"abstract":"<p><p>Oligonucleotide sets are widely used in molecular biology to target a group of nucleic acid sequences using Polymerase Chain Reaction (PCR)-based technologies. Currently, the global matching efficiency of an oligonucleotide set is considered to be equal to the lower matching efficiency calculated for each oligonucleotide. However, sequences matching the limiting oligonucleotide did not always match the other oligonucleotide of the set, resulting in a biased evaluation of the matching efficiency. The OligoSpecificitySystem program avoids this bias by calculating the real global matching efficiency of oligonucleotide sets. It can process all kinds of oligonucleotide sets, including the number of oligonucleotides, base pair degeneracy occurrences or mismatch occurrences.</p>","PeriodicalId":54964,"journal":{"name":"International Journal of Data Mining and Bioinformatics","volume":null,"pages":null},"PeriodicalIF":0.3,"publicationDate":"2014-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1504/ijdmb.2014.062148","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"32999677","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"生物学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}