How to read the web in Portuguese using the never-ending language learner's principles

M. Duarte, Estevam Hruschka
Published in: 2014 14th International Conference on Intelligent Systems Design and Applications, 2014-11-01
DOI: 10.1109/ISDA.2014.7066260 (https://doi.org/10.1109/ISDA.2014.7066260)
Citations: 8

Abstract

An alternative to the traditional single-function approximation method is the never-ending learning (NEL) approach, i.e., a learning paradigm in which the learner autonomously evolves in a constant, incremental, and continuous way over time. More important than simply continuing to evolve, in this paradigm acquired knowledge can be used dynamically to expand the scope and improve the performance of the learning task as a whole. The first never-ending learning system reported in the literature, NELL (Never-Ending Language Learner), is applied to the task of autonomously building a knowledge base by reading the web. Results reported so far show that NELL performs very well when reading the web in English. However, when the same machine reading task (the task of reading the web) was applied to web pages written in Portuguese, previously reported approaches could not match the performance achieved in English. In this paper we describe an approach that differs from those previously proposed in the literature, and we present empirical results corroborating the hypothesis that preprocessing a sufficiently large corpus can be key to using the very same architecture proposed in NELL to read the web in Portuguese, that is, to read and extract knowledge from web pages written in Portuguese.