Linguamatica: Latest Articles

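Hacia una clasificación verbal automática para el español: estudio sobre la relevancia de los diferentes tipos y configuraciones de información sintáctico-semántica [Towards an automatic verb classification for Spanish: a study of the relevance of different types and configurations of syntactic-semantic information]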
IF 0.6
Linguamatica Pub Date : 2015-07-31 DOI: 10.21814/LM.7.1.202
Lara Gil-Vallejo, I. Castellón, Marta Coll-Florit, J. Turmo
{"title":"Hacia una clasificación verbal automática para el español: estudio sobre la relevancia de los diferentes tipos y configuraciones de información sintáctico-semántica","authors":"Lara Gil-Vallejo, I. Castellón, Marta Coll-Florit, J. Turmo","doi":"10.21814/LM.7.1.202","DOIUrl":"https://doi.org/10.21814/LM.7.1.202","url":null,"abstract":"In this work we focus on the automatic acquisition of verb classifications for Spanish. To this end, we carry out a series of experiments with 20 verb senses from the Sensem corpus. We use different types of features covering diverse linguistic information, together with an agglomerative hierarchical clustering method, to generate several classifications. We compare each of these automatic classifications with a gold standard created semi-automatically on the basis of linguistic constructions proposed in theoretical linguistics. This comparison shows which features are best suited to automatically creating a classification consistent with the theory of constructions, and what the similarities and differences are between the automatic verb classification and the one based on the theory of linguistic constructions.","PeriodicalId":41819,"journal":{"name":"Linguamatica","volume":null,"pages":null},"PeriodicalIF":0.6,"publicationDate":"2015-07-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"68371463","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Geração de Linguagem Natural para Conversão de Dados em Texto - Aplicação a um Assistente de Medicação para o Português [Natural Language Generation for Data-to-Text Conversion - Application to a Medication Assistant for Portuguese]
IF 0.6
Linguamatica Pub Date : 2015-07-31 DOI: 10.21814/LM.7.1.206
J. C. Pereira, A. Teixeira
{"title":"Geração de Linguagem Natural para Conversão de Dados em Texto - Aplicação a um Assistente de Medicação para o Português","authors":"J. C. Pereira, A. Teixeira","doi":"10.21814/LM.7.1.206","DOIUrl":"https://doi.org/10.21814/LM.7.1.206","url":null,"abstract":"New devices, such as smartphones and tablets, are changing human-computer interaction. These devices present several challenges, especially due to their small screen and keyboard. In order to use text and voice in multimodal interaction, it is essential to deploy modules to translate the internal information of the applications into sentences or texts, in order to display it on screen or synthesize it. Also, these modules must generate phrases and texts in the user's native language; the development should not require considerable resources; and the outcome of the generation should achieve a good degree of variability. Our main objective is to propose, implement and evaluate a method of data conversion to Portuguese which can be developed with a minimum of time and knowledge, but without compromising the necessary variability and quality of what is generated. The developed system, for a Medication Assistant, is intended to create descriptions, in natural language, of medication to be taken. Motivated by recent results, we opted for an approach based on machine translation, with models trained on a small parallel corpus. For that, a new corpus was created. With it, two variants of the system were trained: phrase-based translation and syntax-based translation. The two variants were evaluated by automatic metrics (BLEU and Meteor) and by humans. The results showed that a phrase-based approach produced better results than a syntax-based one: human evaluators rated 60% of phrase-based responses as good or very good, compared to only 46% of syntax-based responses.
Considering the corpus size, we judge this value (60%) as good.","PeriodicalId":41819,"journal":{"name":"Linguamatica","volume":null,"pages":null},"PeriodicalIF":0.6,"publicationDate":"2015-07-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"68371861","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1
A arquitetura de um glossário terminológico Inglês-Português na área de Eletrotécnica [The architecture of an English-Portuguese terminological glossary in the field of Electrical Engineering]
IF 0.6
Linguamatica Pub Date : 2015-07-31 DOI: 10.21814/LM.7.1.204
S. Fadanelli, M. J. B. Finatto
{"title":"A arquitetura de um glossário terminológico Inglês-Português na área de Eletrotécnica","authors":"S. Fadanelli, M. J. B. Finatto","doi":"10.21814/LM.7.1.204","DOIUrl":"https://doi.org/10.21814/LM.7.1.204","url":null,"abstract":"This article describes some of the procedures for building a prototype of an online English-Portuguese glossary of Electrical Engineering / Electrotechnical terminology, aimed mainly at beginner students in technical and undergraduate Electrical Engineering courses. The methodology comprises a corpus of datasheets, documents often used by Electrical Engineering professionals, and the comparison of data obtained from these datasheets with data gathered from 108 students of Electrical courses. Results point to the relevance of considering the point of view of our target audience to build the glossary properly.","PeriodicalId":41819,"journal":{"name":"Linguamatica","volume":null,"pages":null},"PeriodicalIF":0.6,"publicationDate":"2015-07-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"68371207","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Uma Comparação Sistemática de Diferentes Abordagens para a Sumarização Automática Extrativa de Textos em Português [A Systematic Comparison of Different Approaches to Extractive Automatic Summarization of Texts in Portuguese]
IF 0.6
Linguamatica Pub Date : 2015-07-31 DOI: 10.21814/LM.7.1.203
M. Costa, Bruno Martins
{"title":"Uma Comparação Sistemática de Diferentes Abordagens para a Sumarização Automática Extrativa de Textos em Português","authors":"M. Costa, Bruno Martins","doi":"10.21814/LM.7.1.203","DOIUrl":"https://doi.org/10.21814/LM.7.1.203","url":null,"abstract":"Automatic document summarization is the task of automatically generating condensed versions of source texts, presenting itself as one of the fundamental problems in the areas of Information Retrieval and Natural Language Processing. In this paper, different extractive approaches are compared in the task of summarizing individual documents corresponding to journalistic texts written in Portuguese. Through the use of the ROUGE package for measuring the quality of the produced summaries, we report on results for two different experimental domains, involving (i) the generation of headlines for news articles written in European Portuguese, and (ii) the generation of summaries for news articles written in Brazilian Portuguese. The results demonstrate that methods based on the selection of the first sentences have the best results when building extractive news headlines in terms of several ROUGE metrics. Regarding the generation of summaries with more than one sentence, the method that achieved the best results was the LSA Squared algorithm, for the various ROUGE metrics.","PeriodicalId":41819,"journal":{"name":"Linguamatica","volume":null,"pages":null},"PeriodicalIF":0.6,"publicationDate":"2015-07-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"68371571","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 3
Extração de Relações utilizando Features Diferenciadas para Português [Relation Extraction Using Differentiated Features for Portuguese]
IF 0.6
Linguamatica Pub Date : 2014-12-26 DOI: 10.21814/LM.6.2.182
Erick Nilsen Pereira de Souza, Daniela Barreio Claro
{"title":"Extração de Relações utilizando Features Diferenciadas para Português","authors":"Erick Nilsen Pereira de Souza, Daniela Barreio Claro","doi":"10.21814/LM.6.2.182","DOIUrl":"https://doi.org/10.21814/LM.6.2.182","url":null,"abstract":"Relation Extraction (RE) is a task of Information Extraction (IE) responsible for the discovery of semantic relationships between concepts in unstructured text. When the extraction is not limited to a predefined set of relations, the task is called Open Relation Extraction, whose main challenge is to reduce the proportion of invalid extractions in the universe of relationships identified. Current methods based on a set of specific machine learning features eliminate many of the invalid extractions. However, these solutions have the disadvantage of being highly language-dependent. This dependence arises from the difficulty in finding the most representative set of features for the Open RE problem, considering the peculiarities of each language. In this context, the present work proposes to assess the difficulties of classification based on features in open relation extraction in Portuguese, aiming to provide a basis for new solutions that can reduce language dependence in this task. The results indicate that many representative features in English cannot be mapped directly to the Portuguese language with satisfactory classification performance.
Among the classification algorithms evaluated, J48 showed the best results with an F-measure of 84.1%, followed by SVM (83.9%), Perceptron (82.0%) and Naive Bayes (79.9%).","PeriodicalId":41819,"journal":{"name":"Linguamatica","volume":null,"pages":null},"PeriodicalIF":0.6,"publicationDate":"2014-12-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"68370924","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 7
Izen+aditz konbinazioen azterketa elebiduna, hizkuntza-aplikazio aurreratuei begira [A bilingual analysis of noun+verb combinations, with a view to advanced language applications]
IF 0.6
Linguamatica Pub Date : 2014-12-26 DOI: 10.21814/LM.6.2.188
Uxoa Iñurrieta Urmeneta, I. Aduriz, A. D. D. Ilarraza, Gorka Labaka, K. Sarasola
{"title":"Izen+aditz konbinazioen azterketa elebiduna, hizkuntza-aplikazio aurreratuei begira","authors":"Uxoa Iñurrieta Urmeneta, I. Aduriz, A. D. D. Ilarraza, Gorka Labaka, K. Sarasola","doi":"10.21814/LM.6.2.188","DOIUrl":"https://doi.org/10.21814/LM.6.2.188","url":null,"abstract":"This article deals with noun+verb combinations in bilingual Basque-Spanish and Spanish-Basque dictionaries. We take a look at morphosyntactic and semantic features of word combinations in both language directions, and compare them to identify differences and similarities. Our work reveals the high complexity of those constructions and, hence, the need to address them specifically in Natural Language Processing tools, for example in Machine Translation. All of our results are publicly available online, where users can query the combinations we have analysed.","PeriodicalId":41819,"journal":{"name":"Linguamatica","volume":null,"pages":null},"PeriodicalIF":0.6,"publicationDate":"2014-12-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"68371131","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1
O dicionario de sinónimos como recurso para a expansión de WordNet [The dictionary of synonyms as a resource for the expansion of WordNet]
IF 0.6
Linguamatica Pub Date : 2014-12-26 DOI: 10.21814/LM.6.2.183
Xavier Gómez Guinovart, Miguel Anxo Solla Portela
{"title":"O dicionario de sinónimos como recurso para a expansión de WordNet","authors":"Xavier Gómez Guinovart, Miguel Anxo Solla Portela","doi":"10.21814/LM.6.2.183","DOIUrl":"https://doi.org/10.21814/LM.6.2.183","url":null,"abstract":"In this paper, we present the foundations for a lexical acquisition experiment designed in the framework of the SKATeR research project and aimed at the expansion of the Galician WordNet using the lexicographical data collected in a 'traditional' Galician dictionary of synonyms.","PeriodicalId":41819,"journal":{"name":"Linguamatica","volume":null,"pages":null},"PeriodicalIF":0.6,"publicationDate":"2014-12-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"68370991","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 6
Projetos sobre Tradução Automática do Português no Laboratório de Sistemas de Língua Falada do INESC-ID [Projects on Machine Translation of Portuguese at the Spoken Language Systems Laboratory of INESC-ID]
IF 0.6
Linguamatica Pub Date : 2014-12-26 DOI: 10.21814/LM.6.2.196
Anabela Barreiro, Wang Ling, Luísa Coheur, Fernando Batista, Isabel Trancoso
{"title":"Projetos sobre Tradução Automática do Português no Laboratório de Sistemas de Língua Falada do INESC-ID","authors":"Anabela Barreiro, Wang Ling, Luísa Coheur, Fernando Batista, Isabel Trancoso","doi":"10.21814/LM.6.2.196","DOIUrl":"https://doi.org/10.21814/LM.6.2.196","url":null,"abstract":"Language technologies, in particular machine translation applications, have the potential to help break down linguistic and cultural barriers, presenting an important contribution to the globalization and internationalization of the Portuguese language, by allowing content to be shared 'from' and 'to' this language. This article aims to present the research work developed at the Laboratory of Spoken Language Systems of INESC-ID in the field of machine translation, namely the automated speech translation, the translation of microblogs and the creation of a hybrid machine translation system. We will focus on the creation of the hybrid system, which aims at combining linguistic knowledge, in particular semantico-syntactic knowledge, with statistical knowledge, to increase the level of translation quality.","PeriodicalId":41819,"journal":{"name":"Linguamatica","volume":null,"pages":null},"PeriodicalIF":0.6,"publicationDate":"2014-12-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"68371348","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1
Euskarazko denbora-egiturak. Azterketa eta etiketatze-esperimentua [Temporal structures in Basque. Analysis and tagging experiment]
IF 0.6
Linguamatica Pub Date : 2014-12-25 DOI: 10.21814/LM.6.2.184
Begoña Altuna, M.ª Jesús Aranzabe, A. D. D. Ilarraza
{"title":"Euskarazko denbora-egiturak. Azterketa eta etiketatze-esperimentua","authors":"Begoña Altuna, M.ª Jesús Aranzabe, A. D. D. Ilarraza","doi":"10.21814/LM.6.2.184","DOIUrl":"https://doi.org/10.21814/LM.6.2.184","url":null,"abstract":"Time information extraction is very useful in natural language processing (NLP), as it can be used in text simplification, information extraction and machine translation systems. In this paper we present the first steps towards making that information accessible for the Basque language: on the one hand, Basque structures that convey time have been analysed based on grammars and, on the other hand, first decisions on tagging them in real texts have been taken. We also give an account of an annotation experiment we have carried out on a financial news corpus.","PeriodicalId":41819,"journal":{"name":"Linguamatica","volume":null,"pages":null},"PeriodicalIF":0.6,"publicationDate":"2014-12-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"68371046","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 3
Avaliação de métodos de desofuscação de palavrões [Evaluation of profanity deobfuscation methods]
IF 0.6
Linguamatica Pub Date : 2014-12-25 DOI: 10.21814/LM.6.2.191
Gustavo Laboreiro, E. Oliveira
{"title":"Avaliação de métodos de desofuscação de palavrões","authors":"Gustavo Laboreiro, E. Oliveira","doi":"10.21814/LM.6.2.191","DOIUrl":"https://doi.org/10.21814/LM.6.2.191","url":null,"abstract":"Cursing is a form of expression that is notable for its intensity. When someone uses this form of expression they are emitting a spontaneous and raw form of opinion, usually suppressed for the 'mild ways' and sensitive people. As it happens, this sort of expression is also valuable when doing some sort of opinion mining and sentiment analysis, now a routine task across the social networks. Therefore, in this work we try to evaluate the methods that allow the recovery of these forms of expression, disguised through obfuscation methods, often as a way to escape automatic censorship.","PeriodicalId":41819,"journal":{"name":"Linguamatica","volume":null,"pages":null},"PeriodicalIF":0.6,"publicationDate":"2014-12-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"68371242","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 3