Digital Scholarship in the Humanities: Latest Articles

Lexical diversity as a lens into the classification of Slavic languages: A quantitative typology perspective
IF 0.8, Zone 3, Literature
Digital Scholarship in the Humanities Pub Date : 2023-06-09 DOI: 10.1093/llc/fqad042
Chenliang Zhou, Haitao Liu
Abstract: This study proposes a linguistic classification method based on quantitative typology, which leverages a large-scale multilingual parallel corpus to obtain valid language classification results by excluding the influence of covariates such as text genre and semantic content in cross-language comparison. To achieve this, we model the type–token relationships of each Slavic parallel text and calculate the lexical diversity to approximate the morphological complexity of the language. We perform automatic clustering of languages based on these lexical diversity metrics. Our findings show that (1) the lexical diversity metrics can well reflect that the language is located somewhere on the continuum of ‘analytism-synthetism’; (2) the automatic clustering based on these metrics effectively reflects the genealogical classification of Slavic languages; and (3) the geographical distribution of lexical diversity in the region where Slavic languages are spoken shows a monotonic increasing trend from southwest to northeast, which is consistent with the pattern found by previous authors on a global scale. The methodological approach taken in this study is data-driven, with the benefit of being independent of theoretical assumptions and easy for computer processing. This approach can offer a better insight into corpus-based typology and may shed light on the understanding of language as a human-driven complex adaptive system.
Citations: 0
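The type–token relationship mentioned in the abstract above can be illustrated with a plain type–token ratio (TTR). This is only a minimal sketch, not the authors' model: the tokenizer, the function names, and the two toy sentences (English and Czech renderings of the same content) are assumptions made for illustration. Raw TTR is also sensitive to text length, which is one reason a parallel corpus is a useful control.

```python
import re


def tokenize(text: str) -> list[str]:
    # Hypothetical tokenizer: lowercase, then keep runs of Unicode letters.
    return re.findall(r"[^\W\d_]+", text.lower())


def type_token_ratio(tokens: list[str]) -> float:
    # Distinct word forms (types) divided by running words (tokens).
    return len(set(tokens)) / len(tokens) if tokens else 0.0


# Toy parallel sentences (assumed examples, not data from the study).
analytic = tokenize("the dog sees the cat and the cat sees the dog")
synthetic = tokenize("pes vidí kočku a kočka vidí psa")

print(type_token_ratio(analytic))   # 5 types over 11 tokens
print(type_token_ratio(synthetic))  # 6 types over 7 tokens
```

In this toy pair the more inflected text has more distinct word forms per running word, which is the intuition behind using lexical diversity as a proxy for morphological complexity.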
Introduction to Digital Humanities: Enhancing Scholarship with the Use of Technology. Kathryn C. Wymer
IF 0.8, Zone 3, Literature
Digital Scholarship in the Humanities Pub Date : 2023-06-08 DOI: 10.1093/llc/fqad043
Yali Shi
Citations: 0
Correction to: Unravelling interlanguage facts via explainable machine learning
IF 0.8, Zone 3, Literature
Digital Scholarship in the Humanities Pub Date : 2023-05-17 DOI: 10.1093/llc/fqad035
Citations: 0
“I would I had that corporal soundness”: Pervez Rizvi's Analysis of the Word Adjacency Network Method of Authorship Attribution
IF 0.8, Zone 3, Literature
Digital Scholarship in the Humanities Pub Date : 2023-04-28 DOI: 10.1093/llc/fqad032
G. Egan, Mark Eisen, Alejandro Ribeiro, Santiago Segarra
Abstract: In his two-part article ‘An Analysis of the Word Adjacency Network Method—Part 1—The evidence of its unsoundness’ and ‘Part 2—A true understanding of the method’ Digital Scholarship in the Humanities, 38: 347-78 (2022), Pervez Rizvi attempts to replicate the Word Adjacency Network (WAN) method for authorship attribution and show that it does not produce the new knowledge that we, its inventors, claim for it. In the present essay, we will show that Rizvi misrepresents fundamental aspects of the WAN method, that his attempted replication fails not because the method is flawed but because he erred in replicating it, and that Rizvi misunderstands key aspects of the mathematics of Information Theory that the method uses.
Citations: 0
Provenance visualization: Tracing people, processes, and practices through a data-driven approach to provenance
IF 0.8, Zone 3, Literature
Digital Scholarship in the Humanities Pub Date : 2023-04-24 DOI: 10.1093/llc/fqad020
T. Vancisin, Loraine Clarke, M. Orr, Uta Hinrichs
Abstract: Provenance disclosure—the documentation of an artifact’s origin and how it was produced—is an important aspect to consider when working with historical records which undergo multiple transformations in preparation for and during digitization. Provenance in this context is commonly communicated through explanatory text or static diagrams. However, the methodological and curatorial decisions that have influenced the records’ data are easily overlooked, in particular when exploring the records through visualization as a result of digitization processes. We propose a data-driven approach to provenance disclosure which (1) traces provenance back to when the records were created, (2) documents and categorizes the records’ transformations (transcriptions, content modifications, changes in organization, and representational form), and (3) uses data visualization to disclose provenance in interactive ways. We reflect on how this approach can be practically applied in the context of historical record collections, and we present findings from a qualitative study we conducted to investigate the merits and limitations of provenance-driven visualization. Our findings suggest that data-driven provenance disclosure has the potential to (1) promote transparency and deeper interpretations of historical records, (2) provide rigor in researching historical document collections and underlying production processes, and (3) encourage ethical considerations by making visible labor and implicit bias that influence the production and curation of historical records.
Citations: 0
Proverbs as indicators of proficiency for art-generating AI
IF 0.8, Zone 3, Literature
Digital Scholarship in the Humanities Pub Date : 2023-04-22 DOI: 10.1093/llc/fqad034
Luis J. Tosina Fernández
Abstract: Art generated by Artificial Intelligence (AI) is currently having great repercussion online. The reason for this is the fact that it allows people without creative talent to produce outstanding works by just typing in the description of what they want to illustrate. However, the appearance of this technology has also caused some discomfort among artists and graphic designers, who see their craft threatened by a service that is available to anyone free of charge. In this article, the capability of some of these platforms to process figurative language will be assessed with the help of five well-known proverbs found in almost identical terms across a number of Western languages. These proverbs were used as the prompts on five of the most popular AI art generators accessible at present. After analyzing the results, our experiment concludes that AI evidences significant deficiencies in the processing of proverbs and, therefore, of figurative language. Consequently, AI does not seem able to substitute human agency completely in artistic creation yet. This exposes an aspect that needs improvement not just for the creative applications of AI but for other applications that it may have in the future. To achieve this, disciplines such as psycholinguistics should be integrated into the teams that develop AI.
Citations: 0
A new approach for the construction of historical databases—NoSQL Document-oriented databases: the example of AtlantoCracies
IF 0.8, Zone 3, Literature
Digital Scholarship in the Humanities Pub Date : 2023-04-22 DOI: 10.1093/llc/fqad033
Manuel Díaz-Ordóñez, Domingo Savio Rodríguez Baena, Bartolomé Yun-Casalilla
Abstract: This article proposes, and justifies, the use of Document-oriented databases as a flexible, easy to use, and powerful digital tool in the field of historical research. First, the reasons that have made relational databases the predominant instrument among historians are studied, while detailing the problems involved in their use. Next, the way in which historians have tried to face these problems by using other digital tools is explained, as well as the limitations that such use entails. Through a case study—that of European aristocratic networks in early modern times—it is shown, however, that Document-oriented databases present notable advantages and have greater explanatory power for the historian’s work. Thanks to their flexibility, they are better adapted to the often-unpredictable nature of historical sources without diminishing their ease of use or their analytical potential.
Citations: 0
Web archive analytics: Blind spots and silences in distant readings of the archived web
IF 0.8, Zone 3, Literature
Digital Scholarship in the Humanities Pub Date : 2023-04-19 DOI: 10.1093/llc/fqad014
Simon Donig, Markus Eckl, S. Gassner, Malte Rehbein
Abstract: In this article, we discuss epistemological and methodological aspects of web archive analytics, a recent development towards more data-centred access to web archives. More specifically, we suggest understanding both the process of archiving and subsequent steps of analysis at scale as acts of observation that can be questioned for their epistemological priori. Therefore, we propose the concepts of ‘blind spots’ (features of the live web not included upon creation in the archive) and ‘silences’ (latent features present in the archive but requiring a particular method to be made articulate). In particular, we address two forms of silences playing a structural role in web archive analytics, crucial to both historians and social scientists alike: abundance (or scale) and time. We trace epistemological implications of web archive analytics across an exemplary case study workflow and suggest methodological answers to the issues raised in this process. On the data extraction side, we introduce warc2corpus (w2c), a new tool for extracting granular, structured data, especially temporal information related to the creation, modification, and publication specifically of webpages. For data analysis, we demonstrate how distant reading techniques—more specifically structural topic modelling (STM)—can contribute to providing a rich, temporally structured representation of textual web archive content that in turn can be subjected to scholarly inquiry, interpretation, and re-contextualization.
Citations: 0
NEAT—Named Entities in Archaeological Texts: A semantic approach to term extraction and classification
IF 0.8, Zone 3, Literature
Digital Scholarship in the Humanities Pub Date : 2023-04-13 DOI: 10.1093/llc/fqad017
Maria Pia di Buono, Gennaro Nolano, J. Monti
Abstract: The lack of annotated datasets affects the development of Natural Language Processing applications and heavily impacts the access to textual data, in particular for specific domains and specific languages. In this paper, we propose a methodology to annotate texts concerning domain-specific knowledge, to provide a reliable source of data for the task of Named Entity Recognition (NER) in the domain of archaeology for the Italian language. This method integrates syntactic and semantic information from several structured sources to annotate entities’ mentions in unstructured texts. Furthermore, we make use of an ontology to label entities with the specific type they refer to. By using a corpus made up of item descriptions from Europeana’s Archaeology Collection, we first test our proposed methodology on a mock dataset composed of 1,000 texts. After several steps of improvements, we use the final process to create a complete dataset composed of 5,000 descriptions. The resulting dataset, Named Entities in Archaeological Texts, has a total of 41,002 spans of texts annotated with their domain-specific entity classification according to the CIDOC Conceptual Reference Model.
Citations: 0
Automatic sentence segmentation for classical Chinese: The Spring and Autumn Annals as an example
IF 0.8, Zone 3, Literature
Digital Scholarship in the Humanities Pub Date : 2023-04-12 DOI: 10.1093/llc/fqad016
Wenjie Fan, Dongbo Wang, Shuiqing Huang
Abstract: There exists no sentence boundary in most classical Chinese literature texts. Since it is difficult to read literature of this kind, experts in literature or linguistics would segment the sentence manually. This article explores the effectiveness of classical Chinese sentence segmentation method so as to provide a reference for classical Chinese punctuation. On the basis of the machine learning methods, we chose three components of machine learning, namely models, tagging schemes, and features, to compare the learning results. The models include conditional random field (CRF) models, long short-term memory (LSTM) models, BiLSTM–CRF models, and three Bidirectional Encoder Representation from Transformers (BERT) models. There are five tagging schemes in this article and three features including the statistical feature, Guangyun, and Fanqie. Finally, the performance of the combined feature template is evaluated by ten-fold cross-validation on four classical Chinese texts in different genres. The SikuBERT model is proved to be the most effective model for sentence segmentation at present. Different tagging schemes and various features are introduced. The results show that 5-tag-J tagging schemes can improve performance. Statistical feature, as an important clue for classical Chinese sentence segmentation, is useful in related tasks, but Guangyun and Fanqie have little impact. Other important factors of sentence segmentation are genres and writing styles.
Citations: 0