Chloe Soohyun Jang, Wanson Choi, Seungho Cook, B. Han
{"title":"Analysis of differences in human leukocyte antigen between the two Wellcome Trust Case Control Consortium control datasets","authors":"Chloe Soohyun Jang, Wanson Choi, Seungho Cook, B. Han","doi":"10.5808/gi.2019.17.3.e29","DOIUrl":"https://doi.org/10.5808/gi.2019.17.3.e29","url":null,"abstract":"The Wellcome Trust Case Control Consortium (WTCCC) study was a large genome-wide association study that aimed to identify common variants associated with seven diseases. That study combined two control datasets (58C and UK Blood Services) as shared controls. Prior to using the combined controls, the WTCCC performed analyses to show that the genomic content of the control datasets was not significantly different. Recently, the analysis of human leukocyte antigen (HLA) genes has become prevalent due to the development of HLA imputation technology. In this project, we extended the between-control homogeneity analysis of the WTCCC to HLA. We imputed HLA information in the WTCCC control dataset and showed that the HLA content was not significantly different between the two control datasets, suggesting that the combined controls can be used as controls for HLA fine-mapping analysis based on HLA imputation.","PeriodicalId":94288,"journal":{"name":"Genomics & informatics","volume":" ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2019-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"43993747","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"In silico approach to calculate the transcript capacity","authors":"Young-Sup Lee, Kyung-Hye Won, Jae-Don Oh, Donghyun Shin","doi":"10.5808/GI.2019.17.3.e31","DOIUrl":"https://doi.org/10.5808/GI.2019.17.3.e31","url":null,"abstract":"We introduce a novel concept, transcript capacity (TC), and estimate it using an in silico method. TC refers to the capacity that a transcript exerts in a cell as enzyme or protein function after translation. We used the genome-wide association study (GWAS) beta effect and the transcription level from RNA sequencing to estimate TC. The trait was body fat percentage, and the transcript reads were obtained from the Human Protein Atlas. The assumption was that the GWAS beta effect is the gene’s effect and that TC is related to the corresponding gene effect and transcript reads. Further, we surveyed gene ontology (GO) terms among the genes with the highest and lowest TC. The most frequent GO terms among the highest-TC genes were related to neurons and cell projection organization. The most frequent GO terms among the lowest-TC genes were related to wound healing and embryo development. We expect that our analysis will contribute to estimating TC in diverse species and will benefit new bioinformatic analyses.","PeriodicalId":94288,"journal":{"name":"Genomics & informatics","volume":" ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2019-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"49468368","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Genomics & informatics. Pub Date: 2019-09-01. Epub Date: 2019-09-27. DOI: 10.5808/GI.2019.17.3.e32
Sol A Jeon, Jong Lyul Park, Jong-Hwan Kim, Jeong Hwan Kim, Yong Sung Kim, Jin Cheon Kim, Seon-Young Kim
{"title":"Comparison of the MGISEQ-2000 and Illumina HiSeq 4000 sequencing platforms for RNA sequencing.","authors":"Sol A Jeon, Jong Lyul Park, Jong-Hwan Kim, Jeong Hwan Kim, Yong Sung Kim, Jin Cheon Kim, Seon-Young Kim","doi":"10.5808/GI.2019.17.3.e32","DOIUrl":"https://doi.org/10.5808/GI.2019.17.3.e32","url":null,"abstract":"Currently, Illumina sequencers are the globally leading sequencing platform in the next-generation sequencing market. Recently, MGI Tech launched a series of new sequencers, including the MGISEQ-2000, which promise to deliver high-quality sequencing data faster and at lower prices than Illumina's sequencers. In this study, we compared the performance of two major sequencers (MGISEQ-2000 and HiSeq 4000) to test whether the MGISEQ-2000 sequencer delivers high-quality sequence data as suggested. We performed RNA sequencing of four human colon cancer samples with the two platforms, and compared the sequencing quality and expression values. The data produced from the MGISEQ-2000 and HiSeq 4000 showed high concordance, with Pearson correlation coefficients ranging from 0.98 to 0.99. Various quality control (QC) analyses showed that the MGISEQ-2000 data fulfilled the required QC measures. Our study suggests that the performance of the MGISEQ-2000 is comparable to that of the HiSeq 4000 and that the MGISEQ-2000 can be a useful platform for sequencing.","PeriodicalId":94288,"journal":{"name":"Genomics & informatics","volume":"17 3","pages":"e32"},"PeriodicalIF":0.0,"publicationDate":"2019-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6808641/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"41224609","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Direct-to-consumer genetic testing","authors":"Jong-Won Kim","doi":"10.5808/GI.2019.17.3.e34","DOIUrl":"https://doi.org/10.5808/GI.2019.17.3.e34","url":null,"abstract":"Direct-to-consumer (DTC) genetic testing remains controversial, although the Korean government is considering expanding it. Preventing the exaggeration and abuse of DTC genetic testing is an important task, considering the early history of DTC genetic testing in Korea. The performance and methods of DTC genetic testing have rarely been reported to the scientific and/or medical community, and the reliability of DTC genetic testing needs to be assessed. Law enforcement needs to improve on these issues. The principle of transparency also needs to be applied.","PeriodicalId":94288,"journal":{"name":"Genomics & informatics","volume":" ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2019-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"42760572","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Deep learning for stage prediction in neuroblastoma using gene expression data","authors":"Aron Park, S. Nam","doi":"10.5808/GI.2019.17.3.e30","DOIUrl":"https://doi.org/10.5808/GI.2019.17.3.e30","url":null,"abstract":"Neuroblastoma is a major cause of cancer death in early childhood, and its timely and correct diagnosis is critical. Gene expression datasets have recently been considered as a powerful tool for cancer diagnosis and subtype classification. However, no attempts have yet been made to apply deep learning using gene expression to neuroblastoma classification, although deep learning has been applied to cancer diagnosis using image data. Taking the International Neuroblastoma Staging System stages as multiple classes, we designed a deep neural network using the gene expression patterns and stages of neuroblastoma patients. Despite a small patient population (n = 280), stage 1 and 4 patients were well distinguished. If it is possible to replicate this approach in a larger population, deep learning could play an important role in neuroblastoma staging.","PeriodicalId":94288,"journal":{"name":"Genomics & informatics","volume":" ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2019-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"49302915","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Trends in Genomics & Informatics: a statistical review of publications from 2003 to 2018 focusing on the most-studied genes and document clusters","authors":"Jihyeon Kim, Hee-Jo Nam, Hyun-Seok Park","doi":"10.5808/GI.2019.17.3.e25","DOIUrl":"https://doi.org/10.5808/GI.2019.17.3.e25","url":null,"abstract":"Genomics & Informatics (NLM title abbreviation: Genomics Inform) is the official journal of the Korea Genome Organization. Herein, we conduct a statistical analysis of the publications of Genomics & Informatics over the 16 years since its inception, with a particular focus on issues relating to article categories, word clouds, and the most-studied genes, drawing on recent reviews of the use of word frequencies in journal articles. Trends in the studies published in Genomics & Informatics are discussed both individually and collectively.","PeriodicalId":94288,"journal":{"name":"Genomics & informatics","volume":" ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2019-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"44453929","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Dong-Uk Kim, Minho Lee, Sangjo Han, Miyoung Nam, Sol Lee, Jaewoong Lee, Jihye Woo, Dongsup Kim, K. Hoe
{"title":"Optimization of a microarray for fission yeast","authors":"Dong-Uk Kim, Minho Lee, Sangjo Han, Miyoung Nam, Sol Lee, Jaewoong Lee, Jihye Woo, Dongsup Kim, K. Hoe","doi":"10.5808/GI.2019.17.3.e28","DOIUrl":"https://doi.org/10.5808/GI.2019.17.3.e28","url":null,"abstract":"Bar-code (tag) microarrays of yeast gene-deletion collections facilitate the systematic identification of genes required for growth in any condition of interest. Anti-sense strands of amplified bar-codes hybridize with ~10,000 (5,000 each for up- and down-tags) different kinds of sense-strand probes on an array. In this study, we optimized the hybridization processes of an array for fission yeast. Compared to the first version of the array (11 µm, 100K) consisting of three sectors with probe pairs (perfect match and mismatch), the second version (11 µm, 48K) could represent ~10,000 up-/down-tags in quadruplicate along with 1,508 negative controls in quadruplicate and a single set of 1,000 unique negative controls at randomly dispersed positions without mismatch pairs. For PCR, the optimal annealing temperature (maximizing yield and minimizing extra bands) was 58℃ for both tags. Intriguingly, up-tags required 3× higher amounts of blocking oligonucleotides than down-tags. A 1:1 mix ratio between up- and down-tags was satisfactory. A lower temperature (25℃) was optimal for cultivation instead of a normal temperature (30℃) because of extra temperature-sensitive mutants in a subset of the deletion library. Activation of frozen pooled cells for >1 day showed better resolution of intensity than no activation. A tag intensity analysis showed that tag(s) of 4,316 of the 4,526 strains tested were represented at least once; 3,706 strains were represented by both tags, 4,072 strains by up-tags, and 3,950 strains by down-tags. The results indicate that this microarray will be a powerful analytical platform for elucidating currently unknown gene functions.","PeriodicalId":94288,"journal":{"name":"Genomics & informatics","volume":" ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2019-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"42657769","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Identification of neoantigens derived from alternative splicing and RNA modification","authors":"Jiyeon Park, Y. Chung","doi":"10.5808/GI.2019.17.3.e23","DOIUrl":"https://doi.org/10.5808/GI.2019.17.3.e23","url":null,"abstract":"The acquisition of somatic mutations is the most common event in cancer. Neoantigens expressed from genes with mutations acquired during carcinogenesis can be tumor-specific. Since the immune system recognizes tumor-specific peptides, they are potential targets for personalized neoantigen-based immunotherapy. However, the discovery of druggable neoantigens remains challenging, suggesting that a deeper understanding of the mechanism of neoantigen generation and better strategies to identify them will be required to realize the promise of neoantigen-based immunotherapy. Alternative splicing and RNA editing events are emerging mechanisms leading to neoantigen production. In this review, we outline recent work involving the large-scale screening of neoantigens produced by alternative splicing and RNA editing. We also describe strategies to predict and validate neoantigens from RNA sequencing data.","PeriodicalId":94288,"journal":{"name":"Genomics & informatics","volume":" ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2019-08-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"48834315","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"FusionScan: accurate prediction of fusion genes from RNA-Seq data","authors":"P. Kim, Y. Jang, Sanghyuk Lee","doi":"10.5808/GI.2019.17.3.e26","DOIUrl":"https://doi.org/10.5808/GI.2019.17.3.e26","url":null,"abstract":"Identification of fusion genes is of prominent importance in cancer research because of their potential as carcinogenic drivers. RNA sequencing (RNA-Seq) data have been the most useful source for identification of fusion transcripts. Although a number of algorithms have been developed thus far, most programs produce too many false-positives, thus making experimental confirmation almost impossible. We still lack a reliable program that achieves high precision with a reasonable recall rate. Here, we present FusionScan, a highly optimized tool for predicting fusion transcripts from RNA-Seq data. We specifically search for split reads composed of intact exons at the fusion boundaries. Using 269 known fusion cases as the reference, we have implemented various mapping and filtering strategies to remove false-positives without discarding genuine fusions. In the performance test using three cell line datasets with validated fusion cases (NCI-H660, K562, and MCF-7), FusionScan outperformed other existing programs by a considerable margin, achieving precision and recall rates of 60% and 79%, respectively. Simulation tests also demonstrated that FusionScan recovered most true positives without producing an overwhelming number of false-positives regardless of sequencing depth and read length. The computation time was comparable to other leading tools. We also provide several curative means to help users investigate the details of fusion candidates easily. We believe that FusionScan would be a reliable, efficient and convenient program for detecting fusion transcripts that meets the requirements of the clinical and experimental community. FusionScan is freely available at http://fusionscan.ewha.ac.kr/.","PeriodicalId":94288,"journal":{"name":"Genomics & informatics","volume":" ","pages":""},"PeriodicalIF":0.0,"publicationDate":"2019-07-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"46412581","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Richard Eckart de Castilho, Nancy Ide, Jin-Dong Kim, Jan-Christoph Klie, Keith Suderman
{"title":"Towards cross-platform interoperability for machine-assisted text annotation","authors":"Richard Eckart de Castilho, Nancy Ide, Jin-Dong Kim, Jan-Christoph Klie, Keith Suderman","doi":"10.5808/GI.2019.17.2.e19","DOIUrl":"https://doi.org/10.5808/GI.2019.17.2.e19","url":null,"abstract":"In this paper, we investigate cross-platform interoperability for natural language processing (NLP) and, in particular, annotation of textual resources, with an eye toward identifying the design elements of annotation models and processes that are particularly problematic for, or amenable to, enabling seamless communication across different platforms. The study is conducted in the context of a specific annotation methodology, namely machine-assisted interactive annotation (also known as human-in-the-loop annotation). This methodology requires the ability to freely combine resources from different document repositories, access a wide array of NLP tools that automatically annotate corpora for various linguistic phenomena, and use a sophisticated annotation editor that enables interactive manual annotation coupled with on-the-fly machine learning. We consider three independently developed platforms, each of which utilizes a different model for representing annotations over text, and each of which performs a different role in the process.","PeriodicalId":94288,"journal":{"name":"Genomics & informatics","volume":"17 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2019-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"41498901","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}