Caroline A McCormick, Stuart Akeson, Sepideh Tavakoli, Dylan Bloch, Isabel N Klink, Miten Jain, Sara H Rouhanifard
{"title":"Multicellular, IVT-derived, unmodified human transcriptome for nanopore-direct RNA analysis.","authors":"Caroline A McCormick, Stuart Akeson, Sepideh Tavakoli, Dylan Bloch, Isabel N Klink, Miten Jain, Sara H Rouhanifard","doi":"10.46471/gigabyte.129","DOIUrl":"10.46471/gigabyte.129","url":null,"abstract":"<p><p>Nanopore direct RNA sequencing (DRS) enables measurements of RNA modifications. Modification-free transcripts are a practical and targeted control for DRS, providing a baseline measurement for canonical nucleotides within a matched and biologically-derived sequence context. However, these controls can be challenging to generate and carry nanopore-specific nuances that can impact analyses. We produced DRS datasets using modification-free transcripts from <i>in vitro</i> transcription of cDNA from six immortalized human cell lines. We characterized variation across cell lines and demonstrated how these may be interpreted. These data will serve as a versatile control and resource to the community for RNA modification analyses of human transcripts.</p>","PeriodicalId":73157,"journal":{"name":"GigaByte (Hong Kong, China)","volume":"2024 ","pages":"gigabyte129"},"PeriodicalIF":0.0,"publicationDate":"2024-06-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11221353/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141499780","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Get Free Copy: a multi-repository search platform for biomedical publications.","authors":"Nodir Kosimkhujaev, Kuan-Lin Huang","doi":"10.46471/gigabyte.126","DOIUrl":"10.46471/gigabyte.126","url":null,"abstract":"<p><p>We introduce Get Free Copy (https://getfreecopy.com), a web-based platform designed to streamline the search for biomedical literature across major repositories like arXiv, bioRxiv, medRxiv, and PubMed Central (PMC). Addressing challenges posed by paywalls and fragmented databases, it offers a unified interface for efficient retrieval of free, legitimate copies of biomedical literature. The platform's implementation involves a Node.js backend and dynamic front-end display, enhancing accessibility and research efficiency. As an open-source project, Get Free Copy represents a significant contribution to the open-access movement, inviting global researcher collaboration for further development.</p>","PeriodicalId":73157,"journal":{"name":"GigaByte (Hong Kong, China)","volume":"2024 ","pages":"gigabyte126"},"PeriodicalIF":0.0,"publicationDate":"2024-05-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11154096/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141285565","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Genome assembly of the rare and endangered Grantham's camellia, <i>Camellia granthamiana</i>.","authors":"","doi":"10.46471/gigabyte.124","DOIUrl":"10.46471/gigabyte.124","url":null,"abstract":"<p><p>Grantham's camellia (<i>Camellia granthamiana</i> Sealy) is a rare and endangered tea species discovered in Hong Kong in 1955 and endemic to southern China. Despite its high conservation value, the genomic resources of <i>C. granthamiana</i> are limited. Here, we present a chromosome-scale draft genome of the tetraploid <i>C. granthamiana</i> (2<i>n</i> = 4<i>x</i> = 60), combining PacBio long-read sequencing and Omni-C data. The assembled genome size is ∼2.4 Gb, with most sequences anchored to 15 pseudochromosomes resembling a monoploid genome. The genome has high contiguity, with a scaffold N50 of 139.7 Mb, and high completeness (97.8% BUSCO score). Our gene model prediction resulted in 68,032 protein-coding genes (BUSCO score of 90.9%). We annotated 1.65 Gb of repeat content (68.48% of the genome). Our Grantham's camellia genome assembly is a valuable resource for investigating Grantham's camellia's biology, ecology, and phylogenomic relationships with other <i>Camellia</i> species, and provides a foundation for further conservation measures.</p>","PeriodicalId":73157,"journal":{"name":"GigaByte (Hong Kong, China)","volume":"2024 ","pages":"gigabyte124"},"PeriodicalIF":0.0,"publicationDate":"2024-05-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11131091/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141163069","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
John Terenzini, Yannan Fan, Melissa Jean-Yi Liu, Laura J Falkenberg
{"title":"Jellyfish in Hong Kong: a citizen science dataset.","authors":"John Terenzini, Yannan Fan, Melissa Jean-Yi Liu, Laura J Falkenberg","doi":"10.46471/gigabyte.125","DOIUrl":"10.46471/gigabyte.125","url":null,"abstract":"<p><p>The Hong Kong Jellyfish Project is a citizen science initiative started in early 2021 to enhance our understanding of jellyfish in Hong Kong. Here, we present a dataset of jellyfish sightings collected by citizen scientists from 2021 through 2023 within local waters. Citizen scientists submitted photographs and other data (time, date, and location) using a website, iNaturalist project, and social media. Sightings were validated using references from the literature. A total of 1,020 usable observations are included in this dataset, showing the occurrence and distribution of jellyfish in Hong Kong in 2021-2023. This dataset is now publicly available and discoverable in the Global Biodiversity Information Facility database and is available for download. This data can be used to enhance our understanding of the biodiversity of local marine ecosystems.</p>","PeriodicalId":73157,"journal":{"name":"GigaByte (Hong Kong, China)","volume":"2024 ","pages":"gigabyte125"},"PeriodicalIF":0.0,"publicationDate":"2024-05-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11131163/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141163074","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Neke Ibeh, Charles Y Feigin, Stephen R Frankenberg, Davis J McCarthy, Andrew J Pask, Irene Gallego Romero
{"title":"<i>De novo</i> transcriptome assembly and genome annotation of the fat-tailed dunnart (<i>Sminthopsis crassicaudata</i>).","authors":"Neke Ibeh, Charles Y Feigin, Stephen R Frankenberg, Davis J McCarthy, Andrew J Pask, Irene Gallego Romero","doi":"10.46471/gigabyte.118","DOIUrl":"10.46471/gigabyte.118","url":null,"abstract":"<p><p>Marsupials exhibit distinctive modes of reproduction and early development that set them apart from their eutherian counterparts and render them invaluable for comparative studies. However, marsupial genomic resources still lag far behind those of eutherian mammals. We present a series of novel genomic resources for the fat-tailed dunnart (<i>Sminthopsis crassicaudata</i>), a mouse-like marsupial that, due to its ease of husbandry and <i>ex-utero</i> development, is emerging as a laboratory model. We constructed a highly representative multi-tissue <i>de novo</i> transcriptome assembly of dunnart RNA-seq reads spanning 12 tissues. The transcriptome includes 2,093,982 assembled transcripts and has a mammalian transcriptome BUSCO completeness score of 93.3%, the highest amongst currently published marsupial transcriptomes. This global transcriptome, along with <i>ab initio</i> predictions, supported annotation of the existing dunnart genome, revealing 21,622 protein-coding genes. Altogether, these resources will enable wider use of the dunnart as a model marsupial and deepen our understanding of mammalian genome evolution.</p>","PeriodicalId":73157,"journal":{"name":"GigaByte (Hong Kong, China)","volume":"2024 ","pages":"gigabyte118"},"PeriodicalIF":0.0,"publicationDate":"2024-05-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11091235/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140923702","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Chromosomal-level genome assembly of golden birdwing <i>Troides aeacus</i> (Felder & Felder, 1860).","authors":"","doi":"10.46471/gigabyte.122","DOIUrl":"10.46471/gigabyte.122","url":null,"abstract":"<p><p>The golden birdwing <i>Troides aeacus</i> (Lepidoptera, Papilionidae), a significant species in Asia, faces habitat loss due to urbanization and human activities, necessitating its protection. However, the lack of genomic resources hinders our understanding of their biology and diversity, and impedes our conservation efforts based on genetic information or markers. Here, we present the first chromosomal-level genome assembly of <i>T. aeacus</i> using PacBio SMRT and Omni-C scaffolding technologies. The assembled genome (351 Mb) contains 98.94% of the sequences anchored to 30 pseudo-molecules. The genome assembly has high sequence continuity with contig length N50 = 11.67 Mb and L50 = 14, and scaffold length N50 = 12.2 Mb and L50 = 13. A total of 24,946 protein-coding genes were predicted, with high BUSCO score completeness (98.8% and 94.7% of genome and proteome BUSCO, respectively). This genome offers a significant resource for understanding the swallowtail butterfly biology and carrying out its conservation.</p>","PeriodicalId":73157,"journal":{"name":"GigaByte (Hong Kong, China)","volume":"2024 ","pages":"gigabyte122"},"PeriodicalIF":0.0,"publicationDate":"2024-04-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11068028/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140874142","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Chromosomal-level genome assembly of the long-spined sea urchin <i>Diadema setosum</i> (Leske, 1778).","authors":"","doi":"10.46471/gigabyte.121","DOIUrl":"10.46471/gigabyte.121","url":null,"abstract":"<p><p>The long-spined sea urchin <i>Diadema setosum</i> is an algal and coral feeder widely distributed in the Indo-Pacific that can cause severe bioerosion on the reef community. However, the lack of genomic information has hindered the study of its ecology and evolution. Here, we report the chromosomal-level genome (885.8 Mb) of the long-spined sea urchin <i>D. setosum</i> using a combination of PacBio long-read sequencing and Omni-C scaffolding technology. The assembled genome contains a scaffold N50 length of 38.3 Mb, 98.1% of complete BUSCO (Geno, metazoa_odb10) genes (the single copy score is 97.8% and the duplication score is 0.3%), and 98.6% of the sequences are anchored to 22 pseudo-molecules/chromosomes. A total of 27,478 gene models were annotated, reaching a total of 28,414 transcripts, including 5,384 tRNA and 23,030 protein-coding genes. The high-quality genome of <i>D. setosum</i> presented here is a valuable resource for the ecological and evolutionary studies of this coral reef-associated sea urchin.</p>","PeriodicalId":73157,"journal":{"name":"GigaByte (Hong Kong, China)","volume":"2024 ","pages":"gigabyte121"},"PeriodicalIF":0.0,"publicationDate":"2024-04-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11066563/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140860904","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Chromosome-level genome assembly of the common chiton, <i>Liolophura japonica</i> (Lischke, 1873).","authors":"","doi":"10.46471/gigabyte.123","DOIUrl":"10.46471/gigabyte.123","url":null,"abstract":"<p><p>Chitons (Polyplacophora) are marine molluscs that can be found worldwide from cold waters to the tropics, and play important ecological roles in the environment. However, only two chiton genomes have been sequenced to date. The chiton <i>Liolophura japonica</i> (Lischke, 1873) is one of the most abundant polyplacophorans found throughout East Asia. Our PacBio HiFi reads and Omni-C sequencing data resulted in a high-quality near chromosome-level genome assembly of ∼609 Mb with a scaffold N50 length of 37.34 Mb (96.1% BUSCO). A total of 28,233 genes were predicted, including 28,010 protein-coding ones. The repeat content (27.89%) was similar to that of other Chitonidae species and approximately three times lower than that of the Hanleyidae chiton genome. The genomic resources provided by this work will help to expand our understanding of the evolution of molluscs and the ecological adaptation of chitons.</p>","PeriodicalId":73157,"journal":{"name":"GigaByte (Hong Kong, China)","volume":"2024 ","pages":"gigabyte123"},"PeriodicalIF":0.0,"publicationDate":"2024-04-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11068029/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140869055","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Genome assembly of the edible jelly fungus <i>Dacryopinax spathularia (Dacrymycetaceae)</i>.","authors":"","doi":"10.46471/gigabyte.120","DOIUrl":"10.46471/gigabyte.120","url":null,"abstract":"<p><p>The edible jelly fungus <i>Dacryopinax spathularia</i> (<i>Dacrymycetaceae</i>) is wood-decaying and can be commonly found worldwide. It has found application in food additives, given its ability to synthesize long-chain glycolipids, among other uses. In this study, we present the genome assembly of <i>D. spathularia</i> using a combination of PacBio HiFi reads and Omni-C data. The genome size is 29.2 Mb. It has high sequence contiguity and completeness, with a scaffold N50 of 1.925 Mb and a 92.0% BUSCO score. A total of 11,510 protein-coding genes and 474.7 kb repeats (accounting for 1.62% of the genome) were predicted. The <i>D. spathularia</i> genome assembly generated in this study provides a valuable resource for understanding their ecology, such as their wood-decaying capability, their evolutionary relationships with other fungi, and their unique biology and applications in the food industry.</p>","PeriodicalId":73157,"journal":{"name":"GigaByte (Hong Kong, China)","volume":"2024 ","pages":"gigabyte120"},"PeriodicalIF":0.0,"publicationDate":"2024-04-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11066560/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140874143","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Genome assembly of the milky mangrove <i>Excoecaria agallocha</i>.","authors":"","doi":"10.46471/gigabyte.119","DOIUrl":"10.46471/gigabyte.119","url":null,"abstract":"<p><p>The milky mangrove <i>Excoecaria agallocha</i> is a latex-secreting mangrove that is distributed in tropical and subtropical regions. While its poisonous latex is regarded as a potential source of phytochemicals for biomedical applications, the genomic resources of <i>E. agallocha</i> remain limited. Here, we present a chromosomal-level genome of <i>E. agallocha</i>, assembled from the combination of PacBio long-read sequencing and Omni-C data. The resulting assembly size is 1,332.45 Mb and has high contiguity and completeness with a scaffold N50 of 58.9 Mb and a BUSCO score of 98.4%, with 86.08% of sequences anchored to 18 pseudomolecules. 73,740 protein-coding genes were also predicted. The milky mangrove genome provides a useful resource for further understanding the biosynthesis of phytochemical compounds in <i>E. agallocha</i>.</p>","PeriodicalId":73157,"journal":{"name":"GigaByte (Hong Kong, China)","volume":"2024 ","pages":"gigabyte119"},"PeriodicalIF":0.0,"publicationDate":"2024-04-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11066562/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140854565","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}