Interactive Web-Based Lung Cell Atlases Lower Barriers to Transcriptomic Data Sharing, Mining and Dissemination

C. Cosme Jr, J. Flint, N. Neumark, N. Kothapalli, A. Balayev, T. Adams, J. Schupp, J. Mcdonough, X. Yan, M. Sauler, N. Kaminski
DOI: 10.1164/ajrccm-conference.2022.205.1_meetingabstracts.a4978 (https://doi.org/10.1164/ajrccm-conference.2022.205.1_meetingabstracts.a4978)
Published in: C109. PULMONARY FIBROSIS: MOVING FORWARD WITH GENETICS AND SEQUENCING
Publication date: 2022-05-01

Abstract

Rationale: Recent advances in sequencing technologies have substantially increased the scale and resolution of transcriptomic data. Despite this progress, access to these data remains limited, particularly for researchers from non-computational backgrounds. To improve access to and exploration of our single-cell RNA sequencing data, we built several data sharing, mining and dissemination portals to accompany our idiopathic pulmonary fibrosis (IPF), chronic obstructive pulmonary disease (COPD), and lung endothelial cell (Lung EC) atlases. Descriptions of and links to each website can be found here: https://medicine.yale.edu/lab/kaminski/research/atlas/.

Methods: Each interactive data mining website is coded in the R language using the Shiny package and is hosted by Shinyapps.io. Per-cell expression data for each website are stored in a MySQL database hosted by Amazon Web Services (AWS). Time-associated website engagement statistics and gene query information are collected for each website using a combination of Google Analytics and a gene search table stored in our MySQL database. Several easy-to-use visualization tools on each website facilitate user exploration of the available data.

Results: Website usage statistics since the publication of each website show that 9,772 unique users from 56 countries and five continents have accessed at least one of the three websites. At the time of writing, 300,748 total queries have been made for 15,627 unique genes across the websites. The top five searched genes for the IPF Cell Atlas are CD14, ACE2, ACTA2, IL11 and MUC5B, while for the COPD Cell Atlas they are FAM13A, MIRLET7BHG, HHIP, ISM1 and DDT. Finally, the top searched genes for the Lung Endothelial Cell Atlas are BMPR2, PECAM1, EDNRB, APLNR and PROX1. Of note, interaction with the IPF Cell Atlas increased dramatically at the start of the COVID-19 pandemic, with queries for the ACE2 gene, the putative binding receptor for the SARS-CoV-2 virus, increasing substantially at the pandemic's onset in the United States.

Conclusions: Usage statistics, gene query information and feedback from users in both academia and industry show broad engagement with our websites by individuals from computational and non-computational backgrounds. We envision that widespread adoption of web-based portals similar to ours will facilitate novel discoveries within these complex datasets and foster new scientific collaborations.
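
The Methods describe a gene search table that records queries so that totals, unique-gene counts and per-atlas top-five lists can be reported. The abstract does not give the table's schema or the code behind it, so the following is only a minimal sketch of that idea under stated assumptions: it uses Python's built-in sqlite3 module as a stand-in for the authors' MySQL database and R/Shiny code, and the table name gene_search, its columns, and the helper functions are all hypothetical.

```python
import sqlite3
from datetime import datetime, timezone


def open_log(path=":memory:"):
    """Open (or create) a hypothetical gene search table.

    SQLite is used here only as a stand-in for the MySQL database
    mentioned in the abstract; the schema is an assumption.
    """
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS gene_search (
               atlas       TEXT NOT NULL,  -- e.g. 'IPF', 'COPD', 'Lung EC'
               gene        TEXT NOT NULL,  -- gene symbol as queried
               searched_at TEXT NOT NULL   -- ISO 8601 timestamp (UTC)
           )"""
    )
    return conn


def log_query(conn, atlas, gene, when=None):
    """Record one gene query, normalising the symbol to upper case."""
    when = when or datetime.now(timezone.utc)
    conn.execute(
        "INSERT INTO gene_search (atlas, gene, searched_at) VALUES (?, ?, ?)",
        (atlas, gene.strip().upper(), when.isoformat()),
    )
    conn.commit()


def top_genes(conn, atlas, limit=5):
    """Return the most-queried genes for one atlas as (gene, count) pairs."""
    rows = conn.execute(
        """SELECT gene, COUNT(*) AS n
           FROM gene_search
           WHERE atlas = ?
           GROUP BY gene
           ORDER BY n DESC, gene ASC
           LIMIT ?""",
        (atlas, limit),
    )
    return rows.fetchall()


def summary(conn):
    """Return (total queries, number of unique genes) across all atlases."""
    return conn.execute(
        "SELECT COUNT(*), COUNT(DISTINCT gene) FROM gene_search"
    ).fetchone()


if __name__ == "__main__":
    conn = open_log()
    for gene in ["ACE2", "ace2", "CD14", "ACE2", "MUC5B"]:
        log_query(conn, "IPF", gene)
    log_query(conn, "COPD", "HHIP")
    print(top_genes(conn, "IPF"))  # [('ACE2', 3), ('CD14', 1), ('MUC5B', 1)]
    print(summary(conn))           # (6, 4)
```

Storing a timestamp with each row is what would make time-associated questions answerable, such as how ACE2 queries changed at the onset of the pandemic. The gene names in the demo are toy inputs, not the reported usage data.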