Automatic classification of experimental models in biomedical literature to support searching for alternative methods to animal experiments.

IF 2 · Zone 3 (Engineering and Technology) · Q3 (Mathematical & Computational Biology)

Journal of Biomedical Semantics · Pub Date: 2023-09-01 · DOI: 10.1186/s13326-023-00292-w

Mariana Neves, Antonina Klippert, Fanny Knöspel, Juliane Rudeck, Ailine Stolz, Zsofia Ban, Markus Becker, Kai Diederich, Barbara Grune, Pia Kahnau, Nils Ohnesorge, Johannes Pucher, Gilbert Schönfelder, Bettina Bert, Daniel Butzke

Abstract

Current animal protection laws require replacing animal experiments with alternative methods whenever such methods are suitable for reaching the intended scientific objective. However, searching the scientific literature for alternative methods is time-consuming and requires careful screening of a very large number of experimental biomedical publications. Identifying potentially relevant methods, e.g. organ or cell culture models or computer simulations, can be supported by text mining tools built specifically for this purpose. Such tools are trained (or fine-tuned) on relevant data sets labeled by human experts. We developed the GoldHamster corpus, composed of 1,600 PubMed (Medline) articles (titles and abstracts), in which we manually identified the experimental model used according to a set of eight labels: "in vivo", "organs", "primary cells", "immortal cell lines", "invertebrates", "humans", "in silico" and "other" (models). We recruited 13 annotators with expertise in the biomedical domain and assigned each article to two of them. Four additional rounds of annotation aimed to improve the quality of the annotations on which the annotators had disagreed in the first round. Furthermore, we ran various supervised machine learning experiments to evaluate the corpus for our classification task. We obtained more than 7,000 document-level annotations for the above labels. After the first round of annotation, the inter-annotator agreement (kappa coefficient) varied among labels, ranging from 0.42 (for "other") to 0.82 (for "invertebrates"), with an overall score of 0.62. All disagreements were resolved in the subsequent rounds of annotation. The best-performing machine learning experiment fine-tuned the pre-trained PubMedBERT model on our corpus and achieved an overall F-score of 0.83. We obtained a corpus with high agreement for all labels, and our evaluation showed that the corpus is suitable for training reliable predictive models that automatically classify biomedical literature by the experimental model used. Our SMAFIRA ("Smart feature-based interactive") search tool (https://smafira.bf3r.de) will employ this classifier to support the retrieval of alternative methods to animal experiments. The corpus (https://doi.org/10.5281/zenodo.7152295), the source code (https://github.com/mariananeves/goldhamster) and the model (https://huggingface.co/SMAFIRA/goldhamster) are available for download.
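
The two figures quoted above, the kappa coefficient and the F-score, can be illustrated with a minimal Python sketch. It rests on assumptions the abstract does not confirm: that agreement is Cohen's kappa computed per label from two annotators' yes/no decisions, that an article may carry several labels, and that the F-score is micro-averaged. The function names and toy data are hypothetical and are not taken from the GoldHamster source code.

```python
from typing import Sequence, Set


def cohen_kappa(a: Sequence[int], b: Sequence[int]) -> float:
    """Cohen's kappa for two annotators' binary (0/1) decisions on one label."""
    if len(a) != len(b) or not a:
        raise ValueError("need two non-empty sequences of equal length")
    n = len(a)
    observed = sum(x == y for x, y in zip(a, b)) / n
    # Chance agreement, from each annotator's rate of assigning the label.
    pa, pb = sum(a) / n, sum(b) / n
    expected = pa * pb + (1 - pa) * (1 - pb)
    if expected == 1.0:
        return 1.0  # both annotators gave the same constant answer
    return (observed - expected) / (1 - expected)


def micro_f1(gold: Sequence[Set[str]], pred: Sequence[Set[str]]) -> float:
    """Micro-averaged F-score over per-article label sets (assumed averaging)."""
    tp = sum(len(g & p) for g, p in zip(gold, pred))  # labels correctly predicted
    fp = sum(len(p - g) for g, p in zip(gold, pred))  # predicted but not in gold
    fn = sum(len(g - p) for g, p in zip(gold, pred))  # in gold but missed
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Toy data (invented for illustration): one label, eight articles, two annotators.
annotator_1 = [1, 1, 0, 0, 1, 0, 1, 0]
annotator_2 = [1, 0, 0, 0, 1, 0, 1, 1]
kappa = cohen_kappa(annotator_1, annotator_2)

# Toy gold and predicted label sets for three articles.
gold = [{"in vivo"}, {"primary cells", "humans"}, {"in silico"}]
pred = [{"in vivo"}, {"primary cells"}, {"in silico", "other"}]
f1 = micro_f1(gold, pred)

print(f"kappa={kappa:.2f} micro-F1={f1:.2f}")
```

In the toy data the annotators agree on 6 of 8 articles while chance agreement is 0.5, so kappa is well below the raw agreement rate; this is why kappa is usually preferred over raw agreement for labels that are rarely assigned.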

Source journal: Journal of Biomedical Semantics (Mathematical & Computational Biology)

CiteScore: 4.20

Self-citation rate: 5.30%

Review time: 30 weeks

About the journal: Journal of Biomedical Semantics addresses issues of semantic enrichment and semantic processing in the biomedical domain. The scope of the journal covers two main areas: (1) infrastructure for biomedical semantics, focusing on semantic resources and repositories, meta-data management and resource description, knowledge representation and semantic frameworks, the Biomedical Semantic Web, and semantic interoperability; and (2) semantic mining, annotation, and analysis, focusing on approaches and applications of semantic resources, and on tools for investigation, reasoning, prediction, and discoveries in biomedicine.