Biodiversity is not declining in fiction

Q1 Arts and Humanities
Andrew Piper
DOI: https://doi.org/10.22148/001c.38739
Journal: Journal of Cultural Analytics
Published: 2022-10-06
Publication type: Journal Article
Citations: 0

Abstract

Biodiversity is not declining in fiction
This paper attempts to replicate the findings of the recent work, “The rise and fall of biodiversity in literature,” by Langer et al. (2021). Using a large corpus from Project Gutenberg (N = ~15,000) and a dictionary-matching method of over 240K biological taxa, Langer et al. find that the frequency and diversity of biological taxa have been declining steadily since the first half of the nineteenth century, echoing prior work in cultural analytics. This paper applies the original paper’s three primary measures to two additional datasets along with the original dataset and compares their dictionary-based method with an alternative supervised machine learning method. I find that the trajectory of biological tokens in fiction in the new datasets is directionally opposite to that shown by Langer et al., independent of the methods used (i.e. taxa rise rather than fall since the first half of the nineteenth century), but that their breakpoint estimation appears largely robust within +/- 15 years. Based on this analysis, I suggest that the discrepancy between our results is due to corpus construction rather than choice of method. I find that conditioning only on fiction in the original dataset generates results more similar to the two alternative datasets used here. In addition to emphasizing the importance of corpus construction for cultural analytics, these findings also raise larger questions about the difficulties of interpreting lexical items as indices of social attitudes, pointing to a need for future work.
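As a rough illustration of what a dictionary-matching measure of this kind involves, the sketch below counts how often words from a taxa list occur in a text and how many distinct taxa appear. It is a minimal toy example, not the method of Langer et al. or of this paper: the four-word `TAXA` set, the `taxa_stats` function, and the two measures (relative frequency and a simple richness count) are all assumptions made for illustration, and the abstract does not specify the three measures actually used.

```python
import re
from collections import Counter

# Hypothetical toy dictionary; the original study matches over 240K taxa.
TAXA = {"oak", "sparrow", "fox", "willow"}

def taxa_stats(text, taxa=TAXA):
    """Return (frequency, richness) of dictionary taxa in a text.

    frequency: matched tokens divided by all tokens
    richness:  number of distinct taxa that occur at least once
    """
    tokens = re.findall(r"[a-z]+", text.lower())
    hits = Counter(t for t in tokens if t in taxa)
    frequency = sum(hits.values()) / len(tokens) if tokens else 0.0
    richness = len(hits)
    return frequency, richness

sample = "The fox slept under the oak while another fox watched the sparrow."
freq, richness = taxa_stats(sample)
print(f"frequency={freq:.3f}, richness={richness}")
```

Computing such values per book and grouping them by publication year would give the kind of time series whose direction the abstract discusses. Which texts enter that series is then a corpus-construction choice, which the abstract identifies as the likely source of the discrepancy.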
Source journal
Journal of Cultural Analytics (Arts and Humanities: Literature and Literary Theory)
CiteScore: 2.90
Self-citation rate: 0.00%
Articles published: 9
Review time: 10 weeks