{"title":"Deep learning-based lexical character identification in TV series","authors":"Paola Dalla Torre, Paolo Fantozzi, Maurizio Naldi","doi":"10.1093/llc/fqad068","DOIUrl":"https://doi.org/10.1093/llc/fqad068","url":null,"abstract":"Automated character identification in movies and TV series has been typically carried out through face detection in video and the association of faces with characters’ names extracted from dialogues or cast lists. We propose a deep learning architecture to identify characters based on subtitles only, precisely through the lexicon those characters employ. The identification task is formalized as a multi-class classification task. We apply our technique to the complete set of episodes in the Gomorrah TV series and achieve an average identification accuracy beyond 94 per cent on the full set of characters.","PeriodicalId":45315,"journal":{"name":"Digital Scholarship in the Humanities","volume":"19 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-10-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135303958","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"文学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Tracing connections: using network analysis to study trade and movement in the Mediterranean in the 11th to 14th centuries","authors":"Annabel Hancock","doi":"10.1093/llc/fqad056","DOIUrl":"https://doi.org/10.1093/llc/fqad056","url":null,"abstract":"This study uses network approaches to study late medieval Mediterranean trade and movement and test the validity of using network methods to investigate the past. Historical literature largely focuses on merchant communities and which cities were most central for trade. In this article, two networks, one created from archaeological finds and the other from the writings of four medieval travellers, are analysed using various Social Network Analysis centrality measures and Complex Systems Science models and are compared to each other in order to explore the importance of various Mediterranean settlements and the ways in which movement occurred around the region, investigating whether they challenge or support current understandings. Network methods are shown to be useful approaches with various potential future developments to more fully explore the late medieval Mediterranean. These networks both support and challenge current historiographical views of Mediterranean trade and movement. Many of the same settlements are identified as central, and the importance of islands for movement is supported. However, some smaller settlements, which are infrequently mentioned in current historical literature, are revealed as central. Movement also appears to have relied on small stopping points, rather than following express routes between a few important centres.","PeriodicalId":45315,"journal":{"name":"Digital Scholarship in the Humanities","volume":"56 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-10-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135303800","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"文学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Are Zhuzi contentious? A rhetorical investigation of speech/word radicals in ancient Chinese texts","authors":"Jiao Liu, Ke Li","doi":"10.1093/llc/fqad051","DOIUrl":"https://doi.org/10.1093/llc/fqad051","url":null,"abstract":"In communication, rhetors are inclined to employ contentious rhetorical modes designed to win or compete. Consequently, noncontentious rhetorical modes, such as invitational rhetoric, are underappreciated. This study fosters a better understanding of the rationale and possibility of noncontentious rhetorical modes rooted in texts by traditional Chinese intellectuals. We identify, classify, and interpret indigenous terms identified with speech/word radicals in nine Chinese philosophical classics across five major schools of thought in ancient China using a corpus-driven approach and key concepts of rhetorical studies to delineate the pattern, components, and modes of ancient Chinese rhetoric. The results show that (1) characters with speech/word radicals in ancient Chinese texts follow a pattern between rank and frequency; (2) basic components of rhetorical acts in ancient China can be described based on these terms, and characteristic rhetorical components are identified upon similarities and differences among five schools of thought; and (3) studying rhetorical modes of ancient Chinese rhetoric with speech/word radicals reveals that intellectuals in ancient China adopted both the contentious modes and the noncontentious modes of rhetoric. This study also demonstrates the possibility of studying semantic radicals in texts through digital methods to delineate ancient Chinese rhetoric.","PeriodicalId":45315,"journal":{"name":"Digital Scholarship in the Humanities","volume":"71 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-10-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135482469","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"文学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"In search of founding era registers: automatic modeling of registers from the corpus of Founding Era American English","authors":"Liina Repo, Brett Hashimoto, Veronika Laippala","doi":"10.1093/llc/fqad049","DOIUrl":"https://doi.org/10.1093/llc/fqad049","url":null,"abstract":"Registers are situationally defined text varieties, such as letters, essays, or news articles, that are considered to be one of the most important predictors of linguistic variation. Often historical databases of language lack register information, which could greatly enhance their usability (e.g. Early English Books Online). This article examines register variation in Late Modern English and automatic register identification in historical corpora. We model register variation in the corpus of Founding Era American English (COFEA) and develop machine-learning methods for automatic register identification in COFEA. We also extract and analyze the most significant grammatical characteristics estimated by the classifier for the best-predicted registers and find that letters and journals in the 1700s were characterized by informational density. The chosen method enables us to learn more about registers in the Founding Era. We show that some registers can be reliably identified from COFEA, the best overall performance achieved by the deep learning model Bidirectional Encoder Representations from Transformers with an F1-score of 97 per cent. This suggests that deep learning models could be utilized in other studies concerned with historical language and its automatic classification.","PeriodicalId":45315,"journal":{"name":"Digital Scholarship in the Humanities","volume":"50 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-10-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134948118","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"文学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Review of Introduction to Digital Humanities: Enhancing Scholarship with the Use of Technology. Kathryn C. Wymer","authors":"Zilong Zhong, Lin Fan","doi":"10.1093/llc/fqad053","DOIUrl":"https://doi.org/10.1093/llc/fqad053","url":null,"abstract":"Review of Introduction to Digital Humanities: Enhancing Scholarship with the Use of Technology. Kathryn C. Wymer. New York: Routledge, 2021. 106 pp. ISBN: 978-0-367-71115-3 (P/B). Reviewed by Zilong Zhong (Research Institute of Foreign Languages, Beijing Foreign Studies University, China; https://orcid.org/0000-0002-8512-4701) and Lin Fan (Artificial Intelligence and Human Languages Lab, Beijing Foreign Studies University, China; e-mail: fanlinqd@163.com; fanlin@bfsu.edu.cn). Digital Scholarship in the Humanities, fqad053, https://doi.org/10.1093/llc/fqad053. Published: 05 October 2023.","PeriodicalId":45315,"journal":{"name":"Digital Scholarship in the Humanities","volume":"2011 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-10-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134948121","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"文学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A hybrid part-of-speech tagger with annotated Kurdish corpus: advancements in POS tagging","authors":"Dastan Maulud, Karwan Jacksi, Ismael Ali","doi":"10.1093/llc/fqad066","DOIUrl":"https://doi.org/10.1093/llc/fqad066","url":null,"abstract":"With the rapid growth of online content written in the Kurdish language, there is an increasing need to make it machine-readable and processable. Part of speech (POS) tagging is a critical aspect of natural language processing (NLP), playing a significant role in applications such as speech recognition, natural language parsing, information retrieval, and multiword term extraction. This study details the creation of the DASTAN corpus, the first POS-annotated corpus for the Sorani Kurdish dialect. The corpus, containing 74,258 words and thirty-eight tags, employs a hybrid approach utilizing the bigram hidden Markov model in combination with the Kurdish rule-based approach to POS tagging. This approach addresses two key problems that arise with rule-based approaches, namely misclassified words and ambiguity-related unanalyzed words. The proposed approach’s accuracy was assessed by training and testing it on the DASTAN corpus, yielding a 96% accuracy rate. Overall, this study’s findings demonstrate the effectiveness of the proposed hybrid approach and its potential to enhance NLP applications for Sorani Kurdish.","PeriodicalId":45315,"journal":{"name":"Digital Scholarship in the Humanities","volume":"71 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-10-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135482619","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"文学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Identifying social norm violation in movie plots: from Borat to American Pie","authors":"Yair Neuman, Yochai Cohen, Wenpeng Yin","doi":"10.1093/llc/fqad052","DOIUrl":"https://doi.org/10.1093/llc/fqad052","url":null,"abstract":"The violation of social norms in TV and cinema is a well-known source of humor and catharsis, and researchers in digital humanities may benefit from the automatic identification of social norm violations. In this article, we introduce a novel methodology for identifying and analyzing the violation of social norms in textual data and illustrate it in the analysis of movie plots. The methodology leans on zero-shot classification, specifically relevant when massive, labeled datasets are unavailable. We test our methodology and provide researchers with (1) a theoretically grounded tool for screening textual data for social norm violation and with new datasets that include (2) 6,806 embarrassing situations from movie plots and their hypothesized violated norm and (3) 3,059 movie plots with their average embarrassment score.","PeriodicalId":45315,"journal":{"name":"Digital Scholarship in the Humanities","volume":"438 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-10-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135482727","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"文学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Retraction of: Statistical comparison between the alternatives of love in the poems of Sa'adi and Moulana","authors":"","doi":"10.1093/llc/fqad003","DOIUrl":"https://doi.org/10.1093/llc/fqad003","url":null,"abstract":"","PeriodicalId":45315,"journal":{"name":"Digital Scholarship in the Humanities","volume":"59 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-08-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"135971429","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"文学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Retraction of: Data visualization technique to study the conceptual metaphors in Divan of Hafiz and Bustan of Sa'adi","authors":"","doi":"10.1093/llc/fqad001","DOIUrl":"https://doi.org/10.1093/llc/fqad001","url":null,"abstract":"","PeriodicalId":45315,"journal":{"name":"Digital Scholarship in the Humanities","volume":"19 1","pages":""},"PeriodicalIF":0.8,"publicationDate":"2023-07-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139363978","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"文学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Reconstruction of cultural memory through digital storytelling: A case study of Shanghai Memory project","authors":"","doi":"10.1093/llc/fqad044","DOIUrl":"https://doi.org/10.1093/llc/fqad044","url":null,"abstract":"This article analyses how digital storytelling (DS) is applied to a digital humanities (DH) research project. It considers the purpose of storytelling and illustrates its use to help to democratize the wider project by including diverse voices and helping to reconstruct cultural memory. How can DS be used as a critical research method to help develop a robust methodology in DH research, particularly for organizing historical and cultural resources to form a story world and addressing biases in the established archival collections? This initiative is the latest phase of the Shanghai Memory project, adding an important additional dimension to the established showcase, A Journey from Wukang Road. Wukang Road, with many historical buildings going back to the colonial era, has important cultural significance as part of the former French Concession. Originally known as Rue de Ferguson, the name was changed in 1943, at the time of the Japanese occupation, seemingly as part of anti-colonial sentiment while China was being encouraged to resist her occupiers. Participation in the storytelling project is facilitated by user-generated content and promotion in the Shanghai Library. The aim is to present a clearer storyline about the evolution of Wukang Road, explore its historical context, use the stories and reflections of ordinary people to balance those of the elites, and encourage inclusion of the vernacular Shanghainese dialect as part of wider movements to protect local languages.","PeriodicalId":45315,"journal":{"name":"Digital Scholarship in the Humanities","volume":" ","pages":""},"PeriodicalIF":0.8,"publicationDate":"2023-06-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"46343155","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"文学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}