Unread, yet preserved: A case study on survival of the 19th-century printed poetry
Antonina Martynenko
Literatura, published 2023-07-01
DOI: 10.15446/lthc.v25n2.108775 (https://doi.org/10.15446/lthc.v25n2.108775)
Abstract
Distant reading promises access to "the great unread", which should allow scholars to rethink the history of literature. However, growth in the volume of data does not guarantee an understanding of a corpus or of its relation to the literary population. This article discusses how a "complete" corpus of 19th-century poetry books in Russian might be collected while accounting for historical data and potential survivorship bias. Even if bibliographical sources cannot provide a complete list of books printed in a given period, the amount of "incompleteness" can be directly estimated with unseen species models. The estimated survival ratios for printed poetry show that the loss rate differs across types of sources: conventional editions, such as books and anthologies, are well preserved, while booklets and pamphlets are the largest expected source of loss. These findings allow us to estimate what an "exhaustive" corpus might look like and to define the features of "the unread" and "the unseen" inside it.
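To make the idea of estimating "incompleteness" concrete, here is a minimal sketch of one widely used unseen species estimator, Chao1. The abstract does not say which estimator the article applies, so the choice of Chao1 is an assumption, as are the function names `chao1` and `survival_ratio` and the sighting counts in the usage lines, which are invented purely for illustration. Each count stands for how many times one edition is attested (for example, in how many catalogues or holdings it appears).

```python
from collections import Counter


def chao1(counts):
    """Estimate the total number of editions, seen and unseen (classic Chao1).

    counts: one positive integer per observed edition, giving how many
    times that edition is attested. Hypothetical input format.
    """
    counts = [c for c in counts if c > 0]
    s_obs = len(counts)            # editions observed at least once
    freq = Counter(counts)
    f1, f2 = freq[1], freq[2]      # editions attested exactly once / twice
    if f2 > 0:
        unseen = f1 * f1 / (2 * f2)
    else:
        # Fallback form used when no edition is attested exactly twice.
        unseen = f1 * (f1 - 1) / 2
    return s_obs + unseen


def survival_ratio(counts):
    """Share of the estimated total that is actually observed."""
    observed = sum(1 for c in counts if c > 0)
    estimated = chao1(counts)
    return observed / estimated if estimated else 0.0


# Invented sighting counts for eight editions (illustration only).
sightings = [1, 1, 1, 1, 2, 2, 3, 5]
print(chao1(sightings))                     # 12.0
print(round(survival_ratio(sightings), 2))  # 0.67
```

The more editions are attested only once relative to those attested twice, the larger the estimated unseen share. Under this sketch, a category dominated by single attestations would therefore be expected to show a lower survival ratio.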