Qualitative Analysis of Semantic Language Models
Thibault Clérice, M. Munson
Ancient Manuscripts in Digital Culture, published 2019-05-14
DOI: https://doi.org/10.1163/9789004399297_007
Citation count: 0
Abstract
The task of automatically extracting semantic information from raw textual data is an increasingly important topic in computational linguistics and has begun to make its way into non-linguistic humanities research. Its acceptance as an important task in computational linguistics is shown by its appearance in the field's standard textbooks and handbooks, such as Manning and Schuetze, Foundations of Statistical Natural Language Processing, and Jurafsky and Martin, Speech and Language Processing. According to the Association for Computational Linguistics Wiki, 25 published experiments since 1997 have used the TOEFL (Test of English as a Foreign Language) standardized synonym questions to test the performance of algorithmic extraction of semantic information, with scores ranging from 20% to 100% accuracy. The question addressed by this paper, however, is not whether semantic information can be automatically extracted from textual data; the studies mentioned above have already shown that it can. Nor is it which algorithm is best suited to the task. Instead, this paper aims to make this widely used and accepted task more useful outside purely linguistic studies by considering how one can qualitatively assess the results returned by such algorithms. That is, it aims to move the assessment of those results closer to the actual hermeneutical tasks carried out in, e.g., the historical, cultural, or theological interpretation of texts. We believe that this critical projection of algorithmic results back onto the