The multifunctional LBC-Corpora: different aims depending on the user


LEXICOGRAPHICA Pub Date : 2023-11-01 DOI:10.1515/lex-2023-0010

C. Flinz

LEXICOGRAPHICA, volume 138 1, pages 191 - 208


Abstract

Corpora are nowadays the primary source for many dictionaries and the core element of various platforms and information systems. Lexicographers therefore have a variety of new possibilities that were unthinkable in the past, with different types of corpora available for use in the data collection phase of the lexicographic process (Flinz 2021). Not only lexicographers but also other types of users (academics, translators, teachers, students, etc.) can profit from them, especially when the corpora are public and can be accessed using corpus-linguistic tools (Ballestracci/Buffagni/Flinz 2020; Flinz/Farina 2020). The LBC-Corpora are monolingual, specialized, comparable corpora that are already online (http://corpora.lessicobeniculturali.net/); as monitor corpora (Lemnitzer/Zinsmeister 2015: 140), they will be augmented over time. They can be analysed using the open-source tool NoSketchEngine (Billero 2020). The LBC-Corpora are also the primary lexicographic source of the LBC multilingual dictionary, which is in preparation: the provisional entry lists for several languages (Spanish, German, and French) are now ready (Billero/Farina/Nicolás Martínez 2020) and, together with a set of KWICs chosen through a quantitative-qualitative procedure (for German see Buffagni/Flinz/Ballestracci in prep.), will soon be online (Flinz et al. in prep.). The purpose of this paper is to reflect on the LBC-Corpora from two perspectives: that of the user of the LBC-Platform and that of the lexicographic team. In the first case, following an overview of the principal characteristics of the LBC-Platform, the focus will be on the accessible corpora and the tools that can be used with them (§ 2). In the second case, the LBC-Corpora will be examined in their function as the data basis for the LBC-Dictionary (§ 3). Attention will be on the data preparation phase: after discussing the procedure for compiling the provisional LBC lemma candidate lists, the focus will be on the procedure adopted for finding equivalence relations and for identifying other types of relations between the entries (synonymy, membership of the same semantic field, etc.). In § 4 the focus will be on the provisional LBC lemma candidate lists and their related KWICs. Conclusions and an outlook on the future can be found in the last section (§ 5).

