{"title":"Ontologies and data modeling","authors":"Øyvind Eide, C. Ore","doi":"10.4324/9781315552941-8","DOIUrl":"https://doi.org/10.4324/9781315552941-8","url":null,"abstract":"","PeriodicalId":200326,"journal":{"name":"The Shape of Data in the Digital Humanities","volume":"6 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-11-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125592303","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Data modeling in a digital humanities context","authors":"Julia Flanders, Fotis Jannidis","doi":"10.4324/9781315552941-1","DOIUrl":"https://doi.org/10.4324/9781315552941-1","url":null,"abstract":"","PeriodicalId":200326,"journal":{"name":"The Shape of Data in the Digital Humanities","volume":"30 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-11-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116171921","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Linguistic and computational modeling in language science","authors":"E. Teich, Péter Fankhauser","doi":"10.4324/9781315552941-12","DOIUrl":"https://doi.org/10.4324/9781315552941-12","url":null,"abstract":"historical perspectives. When practiced as a science, linguistics is characterized by the tension between the two methodological dispositions of rationalism and empiricism. At any point in time in the history of linguistics, one is more dominant than the other. In the last two decades, we have been experiencing a new wave of empiricism in linguistic fields as diverse as psycholinguistics (e.g., Chater et al., 2015), language typology (e.g., Piantadosi and Gibson, 2014), language change (e.g., Bybee, 2010) and language variation (e.g., Bresnan and Ford, 2010). Consequently, the practices of modeling are being renegotiated in different linguistic communities, readdressing some fundamental methodological questions such as: How to cast a research question into an appropriate study design? How to obtain evidence (data) for a hypothesis (e.g., experiment vs. corpus)? How to process the data? How to evaluate a hypothesis in the light of the data obtained? This new empiricism is characterized by an interest in language use in context accompanied by a commitment to computational modeling, which is probably most developed in psycholinguistics, giving rise to the field of “computational psycholinguistics” (cf. Crocker, 2010), but recently getting stronger also in corpus linguistics. The predominant domain of corpus linguistics is language variation, aiming at statements on relative differences/similarities between linguistic varieties (time periods, registers, genres). Corpus analysis is thus comparative by nature; technically, this involves comparing probability distributions of (sets of) linguistic features (e.g., the relative frequency of passive vs. active voice in narrative vs. expository genres) and assessing whether they are significantly different or not.
Here, descriptive statistical techniques come into play but also language modeling and machine learning methods (e.g., clustering, latent semantic analysis, or Bayesian modeling). Similarly, corpus processing, that is, preparing text material for analysis, relies on computational models, for example, for annotation. What is important to note here is that processing and analysis are broken up into different steps, each using a different computational micro-model that takes care of a specific task (e.g., labeling linguistic units in annotation) and consists of a descriptive component (set of allowed labels) and an analytic or algorithmic component (procedure by which labels are assigned).","PeriodicalId":200326,"journal":{"name":"The Shape of Data in the Digital Humanities","volume":"86 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-11-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122949749","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Modeling the actual, simulating the possible 1","authors":"W. McCarty","doi":"10.4324/9781315552941-14","DOIUrl":"https://doi.org/10.4324/9781315552941-14","url":null,"abstract":"","PeriodicalId":200326,"journal":{"name":"The Shape of Data in the Digital Humanities","volume":"50 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-11-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122688933","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Algorithmic modeling","authors":"T. Underwood","doi":"10.4324/9781315552941-13","DOIUrl":"https://doi.org/10.4324/9781315552941-13","url":null,"abstract":"","PeriodicalId":200326,"journal":{"name":"The Shape of Data in the Digital Humanities","volume":"27 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-11-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115142645","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Playing for keeps","authors":"C. M. Sperberg-McQueen","doi":"10.4324/9781315552941-15","DOIUrl":"https://doi.org/10.4324/9781315552941-15","url":null,"abstract":"","PeriodicalId":200326,"journal":{"name":"The Shape of Data in the Digital Humanities","volume":"10 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-11-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131857512","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Visualizing information","authors":"Isabel Meirelles","doi":"10.4324/9781315552941-7","DOIUrl":"https://doi.org/10.4324/9781315552941-7","url":null,"abstract":"","PeriodicalId":200326,"journal":{"name":"The Shape of Data in the Digital Humanities","volume":"30 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-11-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127440600","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"How subjective is your model?","authors":"E. Pierazzo","doi":"10.4324/9781315552941-4","DOIUrl":"https://doi.org/10.4324/9781315552941-4","url":null,"abstract":"","PeriodicalId":200326,"journal":{"name":"The Shape of Data in the Digital Humanities","volume":"44 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-11-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130483853","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Modeling and annotating complex data structures","authors":"Piotr Banski, A. Witt","doi":"10.4324/9781315552941-11","DOIUrl":"https://doi.org/10.4324/9781315552941-11","url":null,"abstract":"Although it is possible to associate an unlimited number of arbitrary, complex layers of annotations with a text, an image, or an audio/video file, the most common applications almost always follow the classical approach: additional information associated with primary data is expressed in an ordered hierarchy, using a tree structure as its underlying data model. The present contribution offers a brief review of the more popular ways of data structuring and highlights some of the problems that each of them is meant to handle. The first part of the present chapter focuses on the most relevant issues of data modeling for researchers in the humanities and reviews the basic kinds of the relevant data models. The second part addresses ways to capture these abstract models in concrete encoding formats available to digital humanists. We focus here on approaches that use XML, but the models can also be applied more generally. Information and communication are tightly related: communication relies on the exchange of information, but just as the individual information containers are determined by many kinds of variables, organizing these containers into higher level structures is vital for ensuring success in transmitting complete and compact messages. Finding the appropriate level of complexity for the structuring of information is one of the key problems in the field of digital humanities. Simple information packages are quick to set up, process and visualize, but as the individual fields of study develop, more and more information needs to be accommodated within a vertically tight space of electronic documents. Packaging of complex information raises new theoretical questions and demands new, more efficient, technological solutions.
For the purpose of an introductory example, let us assume that the “information containers” are words, subject to the choice of the natural language but also, on the technological plane, to, for example, the selection of the character encoding, such as ISO 8859-1 (known as “Latin-1”) or Unicode. These words are grouped into larger units: phrases, sentences, or utterances. The structure of these larger units, on the one hand, is dictated by the internal syntactic rules of the given language but, on the other, it is also modeled technologically by the selection of [...] Originally published in: Flanders, Julia/Jannidis, Fotis (Eds.): The shape of data in digital humanities. Modeling texts and text-based resources. London [et al.]: Routledge, 2019. Pp. 217-235. (Digital Research in the Arts and Humanities)","PeriodicalId":200326,"journal":{"name":"The Shape of Data in the Digital Humanities","volume":"7 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-11-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131225828","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"How modeling standards evolve","authors":"L. Burnard","doi":"10.4324/9781315552941-3","DOIUrl":"https://doi.org/10.4324/9781315552941-3","url":null,"abstract":"","PeriodicalId":200326,"journal":{"name":"The Shape of Data in the Digital Humanities","volume":"29 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-11-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129207870","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}