Using lexicography to characterise relations between species mentions in the biodiversity literature

Sandra Young
Proceedings of the 3rd International Conference on Digital Access to Textual Cultural Heritage, published 2019-05-08
DOI: 10.1145/3322905.3322918 (https://doi.org/10.1145/3322905.3322918)
Citations: 0

Abstract

The biodiversity literature is one of the longest-standing examples of recording heritage in the world. Today there are many efforts to standardise and integrate the literature to ensure access to the information, both for heritage and research purposes. Ontologies are increasingly being turned to as knowledge representation tools in these efforts. However, the validity of using ontological frameworks to represent biological taxonomies has been questioned. Biological taxonomies use the scientific nomenclature to assign names to described species. While the nomenclature is a useful classification tool, it can also be a source of confusion because of its synonymous, homonymous and fluid nature. Despite this, no empirical evaluation of scientific nomenclature use in the literature has ever been performed. Corpus-based analysis is already used in automatic ontology extraction, and this study explores the possibility of applying recently developed lexicography techniques to the problem, both to evaluate the empirical data in the literature and to serve as a comparison with existing ontologies. This paper focuses on the workflow, parameters and preliminary findings of the research investigating how to extract structures from the literature to perform these comparisons. It does so by manipulating corpus analysis techniques, visualisation and filtering methods, and it evaluates the potential classification and disambiguation qualities of the resulting graphs for future work. Preliminary results on the effects of frequency and salience when filtering the graphs indicate that these filter parameters could be used for different purposes in revealing relationships between organism mentions.
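
The contrast between frequency and salience filtering mentioned at the end of the abstract can be illustrated with a small sketch. This is not the paper's pipeline: the abstract does not say which salience measure is used, so the example assumes logDice (the association score commonly used in lexicographic corpus tools), assumes co-occurrence within a sentence as the edge criterion, and uses invented toy data. All function and variable names are hypothetical.

```python
import math
from collections import Counter
from itertools import combinations

# Toy input (invented): each inner list holds the organism names found in one sentence.
sentences = [
    ["Apis mellifera", "Bombus terrestris"],
    ["Apis mellifera", "Bombus terrestris"],
    ["Apis mellifera", "Varroa destructor"],
    ["Apis mellifera", "Quercus robur"],
    ["Quercus robur", "Fagus sylvatica"],
]

def cooccurrence_counts(sentences):
    """Count sentences per name and sentences per co-occurring name pair."""
    name_freq = Counter()
    pair_freq = Counter()
    for names in sentences:
        unique = sorted(set(names))  # sort so each pair has one canonical order
        name_freq.update(unique)
        pair_freq.update(combinations(unique, 2))
    return name_freq, pair_freq

def log_dice(f_xy, f_x, f_y):
    """logDice association score: 14 + log2(2 * f_xy / (f_x + f_y))."""
    return 14 + math.log2(2 * f_xy / (f_x + f_y))

def filter_edges(name_freq, pair_freq, min_freq=1, min_salience=float("-inf")):
    """Keep edges whose co-occurrence count and salience both reach the thresholds."""
    kept = {}
    for (a, b), f_ab in pair_freq.items():
        salience = log_dice(f_ab, name_freq[a], name_freq[b])
        if f_ab >= min_freq and salience >= min_salience:
            kept[(a, b)] = (f_ab, salience)
    return kept

name_freq, pair_freq = cooccurrence_counts(sentences)
by_frequency = filter_edges(name_freq, pair_freq, min_freq=2)
by_salience = filter_edges(name_freq, pair_freq, min_salience=13.0)
```

On this toy data the frequency threshold keeps only the pair that co-occurs most often, while the salience threshold also keeps a pair that is rare but whose members seldom appear apart. That difference is one way the two filters could serve different purposes, under the assumptions stated above.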