{"title":"On two SweLL learner corpora – SweLL-pilot and SweLL-gold","authors":"Elena Volodina","doi":"10.3384/ecp205012","DOIUrl":"https://doi.org/10.3384/ecp205012","url":null,"abstract":"SweLL – Swedish Learner Language – is a unifying term for the infrastructure module for research on Swedish as a Second Language (L2), deployed and maintained as part of the larger infrastructure of Språkbanken Text at the University of Gothenburg, Sweden. The SweLL infrastructure module consists of a number of learner data collections, and tools for annotation and management of learner data. As a result, many of its components contain the prefix SweLL in their names, which has created some confusion, especially with regard to the two corpora. In this article we briefly introduce the various SweLL components with a special focus on the differences between the two SweLL corpora.","PeriodicalId":285622,"journal":{"name":"Linköping Electronic Conference Proceedings","volume":"48 18","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-01-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139384754","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Queerlit – a bibliography of Swedish fiction with LGBTQI topics","authors":"Siska Humlesjö, Jenny Bergenmar, Arild Matsson","doi":"10.3384/ecp205005","DOIUrl":"https://doi.org/10.3384/ecp205005","url":null,"abstract":"This paper summarizes the project Queerlit: Metadata and Searchability for LGBTQ+ Literary Heritage 2020-2023 and discusses some challenges in the development of this resource. The Queerlit project consists of four parts: 1. Creating a bibliography of Swedish fiction with LGBTQI themes. 2. Creating a Swedish thesaurus (QLIT), adapted from the linked open data thesaurus Homosaurus. 3. Assigning subject headings from QLIT to all material in the bibliography. 4. A web user interface for searching the material. All four parts are integrated with the Swedish union catalog, Libris, making the results of the project available for all under a CC0 license. QLIT is the first external thesaurus integrated in the linked open data framework used in the technical platform of Libris, XL. The bibliography spans from rune stones from the 7th century to recently published fiction. When applying subject headings for the material, both general aspects of the work and specific LGBTQI topics are described, making this the most comprehensive retrospective indexing project of Swedish literature to date. The underlying knowledge organization is made a prominent method of interacting with the search interface, which is empirically designed around the needs of various user groups.","PeriodicalId":285622,"journal":{"name":"Linköping Electronic Conference Proceedings","volume":"21 2","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-01-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139386029","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"From Zipf distribution to Universal Dependencies – Interactive Notebooks for Swedish Text Analysis","authors":"Dimitrios Kokkinakis","doi":"10.3384/ecp205006","DOIUrl":"https://doi.org/10.3384/ecp205006","url":null,"abstract":"Notebook-based environments are powerful (web-based) interactive development resources for conducting exploratory (textual) data analysis (EDA). These environments allow the embedding of code (code snippets in ‛code cells’) which can be easily executed, with the results immediately presented in the user’s window. This paper introduces some basic exploratory tools and techniques using JupyterLab notebooks, applied to Swedish using a subcorpus that addresses various topics related to the COVID-19 pandemic published during January-December 2021.","PeriodicalId":285622,"journal":{"name":"Linköping Electronic Conference Proceedings","volume":"53 19","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-01-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139386737","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Documentation of data making, processing and use facilitates future reuse of research data: the CAPTURE project","authors":"Isto Huvila, Stefan Ekman","doi":"10.3384/ecp205004","DOIUrl":"https://doi.org/10.3384/ecp205004","url":null,"abstract":"Reuse of research data requires knowing not only what the data is about but also how it was created and previously processed, interpreted and used. The major challenges in capturing enough – but not too much – such process information, termed paradata, are to know what to document and how to document it in adequate detail and form. This paper showcases research and findings from the ERC-funded research project CAPTURE, which develops in-depth understanding of how paradata is being created and used today and which elicits and explores methods for capturing paradata. From a research infrastructure perspective, the most challenging question for managing paradata is how to enable and support the creation of paradata that is sufficient, relevant for its future reusers, and not too labour-intensive to produce and maintain. Considering the significant extent to which paradata is coincidental and exists because of the lack of data cleaning and management, a major challenge is also how to strike a balance between too much and too little standardisation.","PeriodicalId":285622,"journal":{"name":"Linköping Electronic Conference Proceedings","volume":"58 11","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-01-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139386820","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"STUnD: ett Sökverktyg för Tvåspråkiga Universal Dependencies-trädbanker","authors":"Arianna Masciolini, Márton A. Tóth","doi":"10.3384/ecp205013","DOIUrl":"https://doi.org/10.3384/ecp205013","url":null,"abstract":"This article introduces STUND, a search tool for bilingual Universal Dependencies treebanks (Sökverktyg för Tvåspråkiga Universal Dependencies-trädbanker) that enables parallel syntactic searches. We demonstrate its practical application in a case study of the present perfect tense in Swedish and English. The results show that the present perfect is used to roughly the same extent in both languages, but that there is some variation that appears to depend on language-specific conventions and translation strategies.","PeriodicalId":285622,"journal":{"name":"Linköping Electronic Conference Proceedings","volume":"12 12","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-01-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139387464","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Kosten att bedriva svensk ordforskning utan att kränka upphovsrätten","authors":"Gerlof Bouma, Markus Forsberg, Justyna Sikora, Emma Sköldberg","doi":"10.3384/ecp205022","DOIUrl":"https://doi.org/10.3384/ecp205022","url":null,"abstract":"We describe the collaboration between KB-labb and Språkbanken Text to facilitate lexical research on the copyright-protected corpora held in the collections of the National Library of Sweden (Kungliga biblioteket). The initiative has so far resulted in two open datasets, Kubord 1 and 2, which provide access to word statistics and word co-occurrence statistics. We also describe Kubord-fastText, a collection of vector models based on the same corpora, which is under development.","PeriodicalId":285622,"journal":{"name":"Linköping Electronic Conference Proceedings","volume":"45 6","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-01-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139384880","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Samförfattande som datadriven tvärvetenskap: Pragmatiska lärdomar från SweTerror-projektet","authors":"Daniel Brodén, Mats Fridlund, Leif Olsson, Magnus P. Ängsal, Patrik Öhberg","doi":"10.3384/ecp205020","DOIUrl":"https://doi.org/10.3384/ecp205020","url":null,"abstract":"Terrorism in Swedish Politics (SweTerror) is a large-scale interdisciplinary research project with researchers from the humanities and social sciences as well as computer science. At the same time, SweTerror uses and develops national research infrastructure for parliamentary data. This paper describes the use of co-authoring as a data-driven interdisciplinary practice for integrating different scientific perspectives and creating a shared understanding in the project's research. We focus on the significance of the choice to concentrate the collaboration around conference papers specifically within digital humanities, and discuss the experience that co-writing weakens academic territorial thinking, as well as an iterative approach to research data linked to research infrastructures under construction. Finally, we emphasise data-driven co-authoring as a pragmatic practice for strengthening collaboration and knowledge bridges within an interdisciplinary research group.","PeriodicalId":285622,"journal":{"name":"Linköping Electronic Conference Proceedings","volume":"84 9","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-01-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139386100","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Acessing centuries of documentation - Resources to improve access to Swedish rock art documentation and metadata","authors":"Ashely Green, Tristan Bridge, Christian Horn, Siska Humlesjö, Aram Karimi, Johan Ling, Jonathan Westin","doi":"10.3384/ecp205021","DOIUrl":"https://doi.org/10.3384/ecp205021","url":null,"abstract":"The archive of rock art documentation maintained by SHFA provides a valuable resource to archaeologists and others who study rock art. The archive includes images of rock art documentation, sites, and the documentation process, from the 17th century to the more recent high resolution 3D recording and visualizations. In the last few years, GRIDH, in collaboration with SHFA, have begun to improve access to the archive through a Django-based solution and new digital resources. In this paper, we introduce the images in the archive, provide details on the new digital resources, and reflect on how the new website will impact data availability and rock art research.","PeriodicalId":285622,"journal":{"name":"Linköping Electronic Conference Proceedings","volume":"56 12","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-01-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139387253","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Tradita innovare, innovata tradere","authors":"Lars Borin, Louise Holmer","doi":"10.3384/ecp205007","DOIUrl":"https://doi.org/10.3384/ecp205007","url":null,"abstract":"Swedish computational lexicography has a long history at the University of Gothenburg, both in its primary role as a central aspect of the scientific study of vocabulary and also as an infrastructural component for conducting research based on language data. Starting in the 1960s, the Språkdata research group pioneered corpus-supported lexicography for Swedish, forming the basis for successive editions of the two main descriptive dictionaries of contemporary Swedish, SAOL and SO. Language technological lexical resources for Swedish have been developed by the research unit/research infrastructure Språkbanken Text since the turn of the millennium, most recently in the framework of the Swedish FrameNet++ initiative. After two decades of separation, these two largely mutually independently developed strands of computational lexicography have now joined forces under the umbrella of Språkbanken’s lexical research infrastructure to advance the field technically, methodologically, and scientifically.","PeriodicalId":285622,"journal":{"name":"Linköping Electronic Conference Proceedings","volume":"52 11","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-01-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139386551","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Open Brain AI: An AI Research Platform","authors":"C. Themistocleous","doi":"10.3384/ecp205001","DOIUrl":"https://doi.org/10.3384/ecp205001","url":null,"abstract":"Language assessment is pivotal in identifying therapeutic interventions for speech, language, and communication disorders stemming from neurogenic origins, developmental or acquired, and student performance in the classroom. Traditional assessment techniques, however, are predominantly manual, necessitating extensive time and effort for administration and scoring. Such procedures can exacerbate the stress experienced by patients. In response to these inherent challenges, we introduced Open Brain AI (https://openbrainai.com). This state-of-the-art computational platform leverages advanced AI methodologies, encompassing machine learning, natural language processing, large language models, and automated speech-to-text transcription. These capabilities enable Open Brain AI to autonomously analyze multilingual spoken and written language productions. This work aims to present the development and evolution of Open Brain AI, elucidating its AI-driven language processing components and the intricate linguistic metrics it employs to evaluate the overarching and granular discourse structures. Open Brain AI significantly reduces the workload on researchers, clinicians, and teachers by facilitating rapid and automated language analysis. It allows healthcare and education professionals to optimize their operational processes, reallocating precious time and resources to more personalized user interactions. Moreover, Open Brain AI provides clinicians, researchers, and educators the autonomy to undertake essential data analytics, freeing up more bandwidth to focus on other vital facets of therapeutic intervention and care.","PeriodicalId":285622,"journal":{"name":"Linköping Electronic Conference Proceedings","volume":"40 11","pages":""},"PeriodicalIF":0.0,"publicationDate":"2024-01-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139384277","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}