{"title":"Comparison between Complete and Ward’s Linkage Method in Hierarchical Clustering Analysis on Cancer Omics Dataset","authors":"Chen Xinyi","doi":"10.1109/icbcb55259.2022.9802487","DOIUrl":"https://doi.org/10.1109/icbcb55259.2022.9802487","url":null,"abstract":"Diseases, with cancer as a particular example, can arise from a multitude of genetic and epigenetic changes. Studying gene expression profiles of tumor samples from cancer patients can reveal information about novel cancer subtypes. With the development of analytical approaches, clustering methods are widely used on high-dimensional biomedical data, such as omics data, to find groups of samples that have similar profiles and to identify subtypes of cancer. In our study, we applied hierarchical clustering to high-dimensional mRNA-seq data to cluster the subtypes of cancer. Our focus is to compare the performance of two linkage methods in hierarchical clustering, the complete method and Ward’s method, and to investigate the dataset characteristics that determine which linkage measure is more suitable. Our results show that for dispersed datasets (Kurtosis>0.1, CV>5), Ward’s method performs better than the complete method. 
On the other hand, complete method achieves more accurate clustering results than Ward’s method when it is used to analyze relatively more aggregated data (Kurtosis<0.1, CV<5).","PeriodicalId":429633,"journal":{"name":"2022 10th International Conference on Bioinformatics and Computational Biology (ICBCB)","volume":"17 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-05-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125581776","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
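The comparison described in the abstract above can be illustrated with a minimal pure-Python sketch. It is not the author's code: the function names, the naive merge loop, the use of excess kurtosis, and the sample-based coefficient of variation are all assumptions made for illustration, since the abstract does not state how Kurtosis and CV were computed or which implementation was used.

```python
from itertools import combinations


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def agglomerate(points, k, method):
    """Naive agglomerative clustering down to k clusters.

    method is "complete" (largest pairwise distance between clusters)
    or "ward" (increase in within-cluster sum of squares on merging).
    Hypothetical helper, written for clarity rather than speed.
    """
    dims = len(points[0])
    clusters = [[i] for i in range(len(points))]

    def centroid(c):
        return [sum(points[i][d] for i in c) / len(c) for d in range(dims)]

    def cost(a, b):
        if method == "complete":
            return max(sq_dist(points[i], points[j]) for i in a for j in b)
        if method == "ward":
            weight = len(a) * len(b) / (len(a) + len(b))
            return weight * sq_dist(centroid(a), centroid(b))
        raise ValueError(f"unknown method: {method}")

    while len(clusters) > k:
        # Merge the pair of clusters with the smallest linkage cost.
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda p: cost(clusters[p[0]], clusters[p[1]]),
        )
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]

    labels = [0] * len(points)
    for label, members in enumerate(clusters):
        for idx in members:
            labels[idx] = label
    return labels


def coefficient_of_variation(values):
    """Sample standard deviation divided by the mean (assumed definition)."""
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / (len(values) - 1)
    return var ** 0.5 / mean


def excess_kurtosis(values):
    """Fourth standardized moment minus 3 (assumed definition)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    return m4 / m2 ** 2 - 3


# Toy data: two well-separated groups of three points each.
toy = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
complete_labels = agglomerate(toy, 2, "complete")
ward_labels = agglomerate(toy, 2, "ward")
```

On real expression data one would compare each labelling against known subtypes and relate the winner to the dataset's kurtosis and CV; the thresholds 0.1 and 5 come from the abstract, everything else here is illustrative.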
{"title":"A Wearable Multi-sensor System for Classification of Multiple System Atrophy and Parkinson's Disease","authors":"X. Wu, Bohan Yu, P. Liu, Huaiyu Zhu, Jianing Li, Haotian Wang, W. Luo, Pan Yun","doi":"10.1109/icbcb55259.2022.9802460","DOIUrl":"https://doi.org/10.1109/icbcb55259.2022.9802460","url":null,"abstract":"Multiple system atrophy (MSA) is an atypical parkinsonian disorder with faster progression and clinical symptoms similar to Parkinson’s disease (PD). Thus, it is critical to discriminate between the two diseases as early as possible to provide better therapies for patients and gain the maximum benefit. Although some methods, such as positron emission tomography and cerebrospinal fluid analysis, perform well in clinical practice, they are limited because they increase the physical burden on patients and bring extra cost. Recently, significant differences in spatiotemporal gait features between MSA and PD have been demonstrated; however, to the best of our knowledge, there remains no research on differential diagnosis between MSA and PD by gait analysis. Therefore, in this work, we design a wearable multi-sensor system based on inertial sensors to collect gait information from MSA and PD patients and analyze it to make a differential diagnosis between these two diseases with similar symptoms. We evaluated the proposed system on a total of 10 MSA and 21 PD patients. 
As a result, the proposed system reached 89.1% sensitivity, 89.1% specificity and 89.4% accuracy for the classification between MSA and PD.","PeriodicalId":429633,"journal":{"name":"2022 10th International Conference on Bioinformatics and Computational Biology (ICBCB)","volume":"46 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-05-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122800612","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Early Diagnosis of Parkinson's Disease by Analyzing Magnetic Resonance Imaging Brain Scans and Patient Characteristic","authors":"Sabrina Zhu","doi":"10.1109/icbcb55259.2022.9802132","DOIUrl":"https://doi.org/10.1109/icbcb55259.2022.9802132","url":null,"abstract":"Parkinson’s disease (PD) is a chronic condition that affects motor skills and includes symptoms like tremors and rigidity. The current diagnostic procedure uses patient assessments to evaluate symptoms and sometimes a magnetic resonance imaging (MRI) scan. However, symptom variations cause inaccurate assessments, and the analysis of MRI scans requires experienced specialists. This research proposes to use deep learning to diagnose PD severity by combining symptoms data and MRI data, all of which comes from the public Parkinson’s Progression Markers Initiative (PPMI) database, in order to provide specialists and patients with more flexibility. A new hybrid model architecture was implemented to fully utilize both forms of clinical data to evaluate PD severity with high accuracy, and models based on only symptoms and only MRI scans were also developed. The developed model integrates a fully connected deep learning neural network for symptoms data training and a transfer learning-based convolutional neural network for MRI scans training. Instead of performing only binary classification, all models classify patients into five severity categories, with stage zero representing healthy patients and stages four and five representing patients with PD. The symptoms-only, MRI scans-only and hybrid models achieved accuracies of 0.77, 0.68, and 0.94, respectively. The hybrid model also had high precision and recall scores of 0.94 and 0.95. Real clinical cases confirm the hybrid model’s strong performance, where patients were classified incorrectly with both other models but correctly by the hybrid. 
It is also consistent across the five 0-4 severity stages, so early detection of PD is accurate.","PeriodicalId":429633,"journal":{"name":"2022 10th International Conference on Bioinformatics and Computational Biology (ICBCB)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-05-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125468104","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Design and Analysis of MEMS Thermopile Sensor with Multiple Absorption Regions","authors":"Dinghu Zha, Chuanwei Qiao, Peiyu Zhang","doi":"10.1109/icbcb55259.2022.9802489","DOIUrl":"https://doi.org/10.1109/icbcb55259.2022.9802489","url":null,"abstract":"With the maturity of MEMS technology, MEMS thermopile sensors are widely used in the market. In this paper, a thermopile structure with a parallel arrangement of thermocouple strips and multiple infrared absorption zones is designed. A thermal model of the structure is established, the relationship between the output performance and the parameters of each part is obtained, and the structure size is optimized. On this basis, thermoelectric simulation and thermal path analysis of the sensor are carried out, and the optimal results are obtained: the responsivity is 423.3 V/W, the detectivity is 3.49×10^8 cm·Hz^1/2/W, the noise equivalent power is 1.8×10^-10 W/Hz^1/2, and the response time is 20.9 ms. Compared with previously reported thermopile sensors, the output performance is greatly improved. At the end of this paper, the fabrication process is also introduced.","PeriodicalId":429633,"journal":{"name":"2022 10th International Conference on Bioinformatics and Computational Biology (ICBCB)","volume":"27 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-05-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126578074","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Tutorial on 8 Genotype Files Conversion","authors":"Muhammad Muneeb, Samuel F. Feng, Andreas Henschel","doi":"10.1109/icbcb55259.2022.9802470","DOIUrl":"https://doi.org/10.1109/icbcb55259.2022.9802470","url":null,"abstract":"This article documents the file format conversion procedures for eight different genotype file formats using existing tools like Plink, Samtools, Gtools, and custom scripts where necessary. It provides documentation and the corresponding code segment for each conversion, serving the conversion procedures on a plate to beginners and allowing researchers to build on top of the existing code to develop enhanced and faster conversion procedures. The code is written in Python and GNU commands, enabling deployment from general-purpose computers to high-performance computing setups. In addition, the documentation is written in the form of a tutorial, highlighting the reason for each step in the conversion procedure and its effect on intermediate genotype data, which should aid the comprehension of people struggling with file conversion when developing their analysis pipelines. In the first version of the documentation, we considered eight file formats: VCF, BED-BIM-FAM, PED-MAP, GEN-SAMPLE, RAW, HAPS-LEGEND-SAMPLE, 23andme, and AncestryDNA.","PeriodicalId":429633,"journal":{"name":"2022 10th International Conference on Bioinformatics and Computational Biology (ICBCB)","volume":"17 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-05-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115136434","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Will AlphaFold2 Be Helpful in Improving the Accuracy of Single-sequence PPI Site Prediction?","authors":"Zhe Liu, Weihao Pan, Xu Zhen, Ji Liang, Wenxiang Cai, Kai Yuan, G. Lin","doi":"10.1109/icbcb55259.2022.9802490","DOIUrl":"https://doi.org/10.1109/icbcb55259.2022.9802490","url":null,"abstract":"AlphaFold2 has achieved relatively high structure prediction accuracy on proteins. However, it is reported that directly feeding coordinates into deep learning models cannot achieve ideal results on downstream tasks. Therefore, how to process the predicted results into an effective form that deep learning networks can understand to improve the performance of downstream tasks is worth exploring. In this study, taking single-sequence PPI site prediction as an example, we verified the effects of three processing strategies of coordinates, namely spatial filtering, SVD20, and the rASA feature calculation. The experiment results showed that spatial filtering and the rASA feature were two effective and suitable ways to encode structural information for deep learning models. Besides, we also performed a case study of a mutated protein. The results proved that spatial filtering might potentially introduce structural changes into HHblits profiles and deep learning networks when protein mutations occur. 
This work provides new insight into the downstream tasks, such as predicting the binding sites of proteins or predicting the effects of mutations.","PeriodicalId":429633,"journal":{"name":"2022 10th International Conference on Bioinformatics and Computational Biology (ICBCB)","volume":" 36","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-05-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"120833408","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"The Study of Disease Mechanisms Based on Cascading Failure","authors":"Dandan Zhang, Yanhui Wang","doi":"10.1109/icbcb55259.2022.9802465","DOIUrl":"https://doi.org/10.1109/icbcb55259.2022.9802465","url":null,"abstract":"Studying genes closely related to diseases from the perspective of system evolution helps to comprehensively understand the pathogenesis of diseases. Based on the cascading failure load-capacity model, this paper gives a method to screen the key fault nodes between two control groups by using the impact of failed nodes on other nodes, called the cascading failure key nodes method (CFKNM). Taking breast cancer (BC) (GSE15852) as an example, 28 genes with significant differences between the control group and the BC group are screened, among which 14 genes have been confirmed to be significantly correlated with BC; they are significantly correlated with cell growth, apoptosis and metastasis, or are biomarkers and therapeutic targets for breast cancer. This suggests that the method is effective. In addition, the method predicts that C2CD2, HSD11B1 and FMO2 are significantly correlated with breast cancer, although further laboratory validation is still needed.","PeriodicalId":429633,"journal":{"name":"2022 10th International Conference on Bioinformatics and Computational Biology (ICBCB)","volume":"17 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-05-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121194223","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Improved Protein Secondary Structure Prediction Using Bidirectional Long Short-Term Memory Neural Network and Bootstrap Aggregating","authors":"Wenfei Zeng, Ning-Xin Jia, Junda Hu","doi":"10.1109/icbcb55259.2022.9802482","DOIUrl":"https://doi.org/10.1109/icbcb55259.2022.9802482","url":null,"abstract":"Accurately predicting protein secondary structure information is essential to identify structural classes, folds, and tertiary structures of proteins. In this study, we propose an accurate predictor, BiBagPSS, for predicting protein secondary structure information based on integrating a Bidirectional Long Short-Term Memory (BiLSTM) neural network, a fully connected (FC) neural network, and the strategy of bootstrap aggregating (Bagging). In BiBagPSS, three different feature views, i.e., position-specific scoring matrix (PSSM), hidden Markov model profile (HMM), and predicted solvent accessibility probability matrix (PSAPM), are first employed to extract different protein-level features. Secondly, the above three features are combined and fed into a stacked neural network composed of BiLSTM and FC units. Thirdly, the predicted secondary structure probability matrix (PSSPM) generated by the trained model is added to the input features for re-training the model. In order to fully exploit the information available in the training data set, we employ the strategy of bootstrap aggregating to train multiple stacked neural network models. Finally, according to the voting results of the above models, the secondary structure state of each protein residue is determined. Experimental results show that BiBagPSS achieves Q3 scores of 82.39 and 77.30, Q8 scores of 69.95 and 65.61 on TEST524 and CASP14set data sets, respectively, which are higher than or comparable to most of the state-of-the-art predictors. 
Detailed data analyses show that the major advantage of BiBagPSS lies in the utilization of the PSSPM that helps extract more discriminative information compared with the previously used machine learning algorithms. Meanwhile, the Bagging strategy improves the ability of BiBagPSS to mine available information.","PeriodicalId":429633,"journal":{"name":"2022 10th International Conference on Bioinformatics and Computational Biology (ICBCB)","volume":"23 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-05-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116507769","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Predicting Conversion to Mild Cognitive Impairment in Cognitively Normal with Incomplete Multi-modal Neuroimages","authors":"Yuqing Sun, Yong Liu, Bing Liu","doi":"10.1109/icbcb55259.2022.9802479","DOIUrl":"https://doi.org/10.1109/icbcb55259.2022.9802479","url":null,"abstract":"Assessing clinical progression from cognitively normal (CN) to mild cognitive impairment (MCI) is crucial for early intervention before the onset of cognitive decline. Multi-modal neuroimaging data has provided supplementary biomarkers for computer-aided prediction of neurodegeneration diseases. However, it is still unknown whether tau uptake in positron emission tomography (PET) provides much power for identifying progressive CN who will convert to MCI, since subjects usually lack tau PET scans. In this study, we proposed a neuroimage synthesis network to impute missing tau PET images based on their corresponding T1-weighted magnetic resonance imaging (MRI) scans. With the real MRI and synthetic PET data after imputation, we applied support vector machine classifiers on regional measurement of anatomical features extracted from pre-defined atlases for prediction. 
Experimental results on Alzheimer's Disease Neuroimaging Initiative dataset suggest that our neuroimage synthesis network synthesized reasonable neuroimages and complementary information provided by tau PET improved the accuracy of identification.","PeriodicalId":429633,"journal":{"name":"2022 10th International Conference on Bioinformatics and Computational Biology (ICBCB)","volume":"18 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-05-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124908908","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Identification of Hub Genes and Key Pathways in TNF-α and IFN-γ Induced Cytokine Storms via Bioinformatics","authors":"Ryan Christian Mailem, L. Tayo","doi":"10.1109/icbcb55259.2022.9802459","DOIUrl":"https://doi.org/10.1109/icbcb55259.2022.9802459","url":null,"abstract":"Cytokine storms, an overaggressive immune response due to the overexpression of pro-inflammatory cytokines, have been identified to play a significant role in COVID-19 infections. Studies have shown that TNF-α and IFN-γ are integral to the process, however, its genetic mechanisms have yet to be fully elucidated. Herein, the key changes in the gene expression of TNF-α and IFN-γ induced cytokine storms are identified through differential gene analysis on the publicly available GEO GSE160163 dataset. GO and KEGG enrichment were used to annotate identified DEGs, and a PPI network was constructed based on the STRING database. A total of 446 differentially expressed genes were identified. Up-regulated genes and downregulated genes were enriched in viral immune response and infection pathways, and steroid biosynthesis and metabolic pathways, respectively. PPI construction revealed 1,834 interactions between 428 proteins, indicating their biological connectivity. Module analysis identified nine (9) hub genes: STAT1, CXCL10, CD274, CXCL9, IRF1, PSMB9, CD86, STAT3, and CXCR4, involved in viral immune response and three (3) significant modules involved in NOD-like receptor signaling, steroid biosynthesis, and viral infections. 
These identified DEGs, hub genes, and their respective enriched pathways aid us in understanding the molecular mechanisms of cytokine storms, as well as provide potential gene targets and druggable receptors for the treatment of cytokine storms.","PeriodicalId":429633,"journal":{"name":"2022 10th International Conference on Bioinformatics and Computational Biology (ICBCB)","volume":"108 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-05-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124057128","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}