International Conference on Language, Data, and Knowledge: Latest Publications

Comparison of Different Orthographies for Machine Translation of Under-Resourced Dravidian Languages
International Conference on Language, Data, and Knowledge Pub Date : 2019-05-20 DOI: 10.4230/OASIcs.LDK.2019.6
Bharathi Raja Chakravarthi, Mihael Arcan, John P. McCrae
Abstract: Under-resourced languages are a significant challenge for statistical approaches to machine translation, and recently it has been shown that the usage of training data from closely-related languages can improve machine translation quality of these languages. While languages within the same language family share many properties, many under-resourced languages are written in their own native script, which makes taking advantage of these language similarities difficult. In this paper, we propose to alleviate the problem of different scripts by transcribing the native script into a common representation, i.e. the Latin script or the International Phonetic Alphabet (IPA). In particular, we compare coarse-grained transliteration to the Latin script with fine-grained IPA transliteration. We performed experiments on the English-Tamil, English-Telugu, and English-Kannada translation tasks. Our results show improvements in terms of the BLEU, METEOR and chrF scores from transliteration and we find that the transliteration into the Latin script outperforms the fine-grained IPA transcription.
Citations: 51
A Proposal for a Two-Way Journey on Validating Locations in Unstructured and Structured Data
International Conference on Language, Data, and Knowledge Pub Date : 2019-05-20 DOI: 10.4230/OASIcs.LDK.2019.13
Ilkcan Keles, Omar Qawasmeh, Tabea Tietz, Ludovica Marinucci, Roberto Reda, M. Erp
Abstract: The Web of Data has grown explosively over the past few years, and as with any dataset, there are bound to be invalid statements in the data, as well as gaps. Natural Language Processing (NLP) is gaining interest to fill gaps in data by transforming (unstructured) text into structured data. However, there is currently a fundamental mismatch in approaches between Linked Data and NLP as the latter is often based on statistical methods, and the former on explicitly modelling knowledge. However, these fields can strengthen each other by joining forces. In this position paper, we argue that using linked data to validate the output of an NLP system, and using textual data to validate Linked Open Data (LOD) cloud statements is a promising research avenue. We illustrate our proposal with a proof of concept on a corpus of historical travel stories.
Citations: 0
OWLC: A Contextual Two-Dimensional Web Ontology Language
International Conference on Language, Data, and Knowledge Pub Date : 2019-05-20 DOI: 10.4230/OASIcs.LDK.2019.2
Sahar Aljalbout, Didier Buchs, G. Falquet
Abstract: Representing and reasoning on contexts is an open problem in the semantic web. Despite the fact that context representation has for a long time been treated locally by semantic web practitioners, a recognized and widely accepted consensus regarding the way of encoding and particularly reasoning on contextual knowledge has not yet been reached. In this paper, we present OWL^C: a contextual two-dimensional web ontology language. Using the first dimension, we can reason on context-dependent classes, properties, and axioms and using the second dimension, we can reason on knowledge about contexts which we consider formal objects, as proposed by McCarthy [McCarthy, 1987]. We demonstrate the modeling strength and reasoning capabilities of OWL^C with a practical scenario from the digital humanities domain. We chose the Ferdinand de Saussure [Joseph, 2012] use case in virtue of its inherent contextual nature, as well as its notable complexity which allows us to highlight many issues connected with contextual knowledge representation and reasoning.
Citations: 3
Interlinking SciGraph and DBpedia Datasets Using Link Discovery and Named Entity Recognition Techniques
International Conference on Language, Data, and Knowledge Pub Date : 2019-05-20 DOI: 10.4230/OASICS.LDK.2019.15
Beyza Yaman, Michele Pasin, M. Freudenberg
Abstract: In recent years we have seen a proliferation of Linked Open Data (LOD) compliant datasets becoming available on the web, leading to an increased number of opportunities for data consumers to build smarter applications which integrate data coming from disparate sources. However, often the integration is not easily achievable since it requires discovering and expressing associations across heterogeneous data sets. The goal of this work is to increase the discoverability and reusability of the scholarly data by integrating them with highly interlinked datasets in the LOD cloud. In order to do so we applied techniques that a) improve the identity resolution across these two sources using Link Discovery for the structured data (i.e. by annotating Springer Nature (SN) SciGraph entities with links to DBpedia entities), and b) enrich SN SciGraph unstructured text content (document abstracts) with links to DBpedia entities using Named Entity Recognition (NER). We published the results of this work using standard vocabularies and provided an interactive exploration tool which presents the discovered links w.r.t. the breadth and depth of the DBpedia classes.
Citations: 11
Metalexicography as Knowledge Graph
International Conference on Language, Data, and Knowledge Pub Date : 2019-05-20 DOI: 10.4230/OASIcs.LDK.2019.19
David Lindemann, Christian Klaes, P. Zumstein
Abstract: This short paper presents preliminary considerations regarding LexBib, a corpus, bibliography, and domain ontology of Lexicography and Dictionary Research, which is currently being developed at University of Hildesheim. The LexBib project is intended to provide a bibliographic metadata collection made available through an online reference platform. The corresponding full texts are processed with text mining methods for the generation of additional metadata, such as term candidates, topic models, and citations. All LexBib content is represented and also publicly accessible as RDF Linked Open Data. We discuss a data model that includes metadata for publication details and for the text mining results, and that considers relevant standards for an integration into the LOD cloud.
Citations: 0
Opening Digitized Newspapers Corpora: Europeana's Full-Text Data Interoperability Case
International Conference on Language, Data, and Knowledge Pub Date : 2019-05-01 DOI: 10.4230/OASIcs.LDK.2019.22
Nuno Freire, Antoine Isaac, Twan Goosen, D. Broeder, Hugo Manguinhas, V. Charles
Abstract: Cultural heritage institutions hold collections of printed newspapers that are valuable resources for the study of history, linguistics and other Digital Humanities scientific domains. Effective retrieval of newspaper content based on metadata only is nearly impossible, making retrieval based on (digitized) full-text particularly relevant. Europeana, Europe’s Digital Library, is in the position to provide access to large newspaper collections with full-text resources. Full-text corpora are also relevant for Europeana’s objective of promoting the usage of cultural heritage resources within research infrastructures. We have derived requirements for aggregating and publishing Europeana’s newspapers full-text corpus in an interoperable way, based on investigations into the specific characteristics of cultural data, the needs of two research infrastructures (CLARIN and EUDAT) and the practices being promoted in the International Image Interoperability Framework (IIIF) community. We have then defined a “full-text profile” for the Europeana Data Model, which is being applied to Europeana’s newspaper corpus.
Citations: 0
Validation Methodology for Expert-Annotated Datasets: Event Annotation Case Study
International Conference on Language, Data, and Knowledge Pub Date : 2019-05-01 DOI: 10.4230/OASIcs.LDK.2019.12
O. Inel, Lora Aroyo
Abstract: Event detection is still a difficult task due to the complexity and the ambiguity of such entities. On the one hand, we observe a low inter-annotator agreement among experts when annotating events, disregarding the multitude of existing annotation guidelines and their numerous revisions. On the other hand, event extraction systems have a lower measured performance in terms of F1-score compared to other types of entities such as people or locations. In this paper we study the consistency and completeness of expert-annotated datasets for events and time expressions. We propose a data-agnostic validation methodology of such datasets in terms of consistency and completeness. Furthermore, we combine the power of crowds and machines to correct and extend expert-annotated datasets of events. We show the benefit of using crowd-annotated events to train and evaluate a state-of-the-art event extraction system. Our results show that the crowd-annotated events increase the performance of the system by at least 5.3%.
Citations: 11
An Evaluation Dataset for Linked Data Profiling
International Conference on Language, Data, and Knowledge Pub Date : 2017-06-19 DOI: 10.1007/978-3-319-59888-8_1
Andrejs Abele, John P. McCrae, P. Buitelaar
Citations: 2
Multi-label Text Classification Using Semantic Features and Dimensionality Reduction with Autoencoders
International Conference on Language, Data, and Knowledge Pub Date : 2017-06-19 DOI: 10.1007/978-3-319-59888-8_32
Wael Alkhatib, Christoph Rensing, Johannes Silberbauer
Citations: 12
Answering the Hard Questions
International Conference on Language, Data, and Knowledge Pub Date : 2017-06-19 DOI: 10.1007/978-3-319-59888-8_22
Maria Khvalchik, Chanin Pithyaachariyakul, Anagha Kulkarni
Citations: 1