Co-occurrence of Cell Lines, Basal Media and Supplementation in the Biomedical Research Literature

Jessica Cox, Darin McBeath, Corey A. Harper, Ron Daniel
Journal of Data and Information Science (Warsaw, Poland), vol. 5, no. 1, pp. 161-177. Journal Article. Published 2020-07-03.
DOI: 10.2478/jdis-2020-0016 (https://doi.org/10.2478/jdis-2020-0016)
Citation count: 3

Abstract

Purpose: The use of in vitro cell culture and experimentation is a cornerstone of biomedical research; however, more attention has recently been given to the potential consequences of using artificial basal media and undefined supplements. As a first step towards better understanding and measuring the impact these systems have on experimental results, we use text mining to capture typical research practices and trends around cell culture.

Design/methodology/approach: To measure the scale of in vitro cell culture use, we analyzed a corpus of 94,695 research articles that appear in biomedical research journals published in ScienceDirect from 2000 to 2018. Central to our investigation is the observation that studies using cell culture describe conditions using a typical sentence structure: cell line, basal medium, and supplemented compounds. We tag our corpus with a curated list of basal media and the Cellosaurus ontology using the Aho-Corasick algorithm. We also processed the corpus with Stanford CoreNLP to find nouns that follow the basal medium, in an attempt to identify the supplements used.

Findings: Interestingly, we find that researchers frequently use DMEM even if a cell line's vendor recommends less concentrated media. We see long-tailed distributions for the usage of media and cell lines, with DMEM and RPMI dominating the media, and HEK293, HEK293T, and HeLa dominating the cell lines used.

Research limitations: Our analysis was restricted to documents in ScienceDirect. Our text mining method achieved high recall but low precision, and it required manual inspection of many tokens.

Practical implications: Our findings document current cell culture practices in the biomedical research community, which can be used as a resource for future experimental design.

Originality/value: No other work has taken a text mining approach to surveying cell culture practices in biomedical research.
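The dictionary tagging step described in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the term lists below are a handful of names taken from the abstract rather than the curated media list or the Cellosaurus ontology, the sample sentences are invented, and the function names (`build_automaton`, `find_terms`, `cooccurrences`) are hypothetical. The word-boundary filter and the choice to count co-occurrence per sentence are also assumptions made for illustration, and the code assumes ASCII terms.

```python
from collections import Counter, deque

# Illustrative dictionaries only (assumption): names mentioned in the abstract.
CELL_LINES = ["HEK293", "HEK293T", "HeLa"]
MEDIA = ["DMEM", "RPMI"]


def build_automaton(terms):
    """Build an Aho-Corasick automaton (trie plus failure links) over the terms."""
    goto = [{}]   # goto[state][char] -> next state
    fail = [0]    # failure link for each state
    out = [[]]    # dictionary terms recognised on reaching each state
    for term in terms:
        state = 0
        for ch in term.lower():
            if ch not in goto[state]:
                goto.append({})
                fail.append(0)
                out.append([])
                goto[state][ch] = len(goto) - 1
            state = goto[state][ch]
        out[state].append(term)
    # Breadth-first pass: set failure links and inherit outputs from them.
    queue = deque(goto[0].values())
    while queue:
        state = queue.popleft()
        for ch, nxt in goto[state].items():
            queue.append(nxt)
            f = fail[state]
            while f and ch not in goto[f]:
                f = fail[f]
            fail[nxt] = goto[f].get(ch, 0)
            out[nxt] = out[nxt] + out[fail[nxt]]
    return goto, fail, out


def find_terms(text, automaton):
    """Return (start_offset, term) for each whole-word, case-insensitive match."""
    goto, fail, out = automaton
    lowered = text.lower()
    state, hits = 0, []
    for i, ch in enumerate(lowered):
        while state and ch not in goto[state]:
            state = fail[state]
        state = goto[state].get(ch, 0)
        for term in out[state]:
            start, end = i - len(term) + 1, i + 1
            # Assumed boundary rule: reject matches embedded in a longer token,
            # so "HEK293" is not reported inside "HEK293T".
            before_ok = start == 0 or not lowered[start - 1].isalnum()
            after_ok = end == len(lowered) or not lowered[end].isalnum()
            if before_ok and after_ok:
                hits.append((start, term))
    return hits


def cooccurrences(sentences):
    """Count (cell line, medium) pairs that appear in the same sentence."""
    automaton = build_automaton(CELL_LINES + MEDIA)
    counts = Counter()
    for sentence in sentences:
        found = {term for _, term in find_terms(sentence, automaton)}
        for line in sorted(found & set(CELL_LINES)):
            for medium in sorted(found & set(MEDIA)):
                counts[(line, medium)] += 1
    return counts


# Invented example sentences in the "cell line, basal medium, supplement" pattern.
SAMPLE = [
    "HeLa cells were cultured in DMEM supplemented with 10% FBS.",
    "HEK293T cells were maintained in DMEM containing 10% fetal bovine serum.",
    "Primary neurons were grown in RPMI.",
]

if __name__ == "__main__":
    for pair, n in cooccurrences(SAMPLE).items():
        print(pair, n)
```

A plain dictionary match of this kind favours recall over precision, which is consistent with the limitation the abstract reports: short or ambiguous names can match unrelated tokens, so the hits still need manual review.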