2009 Seventh Brazilian Symposium in Information and Human Language Technology: Latest Publications

Content Selection Operators for Multidocument Summarization Based on Cross-Document Structure Theory
M. L. C. Jorge, T. Pardo
{"title":"Content Selection Operators for Multidocument Summarization Based on Cross-Document Structure Theory","authors":"M. L. C. Jorge, T. Pardo","doi":"10.1109/STIL.2009.15","DOIUrl":"https://doi.org/10.1109/STIL.2009.15","url":null,"abstract":"This paper aims at presenting an analysis of content selection techniques for multidocument summarization based on the multidocument discourse theory CST (Cross-document Structure Theory). We approach the task of content selection by using CST-based operators and focus specifically on redundancy treatment, which is an important and pervasive problem in multidocument summarization. Our experiments with Brazilian Portuguese news texts show that CST improves summaries quality by exploring relations among texts. Particularly, redundancy is reduced by identifying common information among texts, especially when compression rate is low.","PeriodicalId":265848,"journal":{"name":"2009 Seventh Brazilian Symposium in Information and Human Language Technology","volume":"9 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-09-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134181903","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 6
Automatic Fusion of Similar Sentences in Portuguese
E. M. Seno, M. G. V. Nunes
{"title":"Automatic Fusion of Similar Sentences in Portuguese","authors":"E. M. Seno, M. G. V. Nunes","doi":"10.1109/STIL.2009.27","DOIUrl":"https://doi.org/10.1109/STIL.2009.27","url":null,"abstract":"This paper presents a Portuguese sentence fusion model. Sentence fusion is a text-to-text generation task which takes a set of similar sentences as input and combines these into a single output sentence. This process is of extreme relevance in many NLP applications, for instance, to treat redundancies in Multidocument Summarization by fusing information from a set of related sentences into a new one. We present three intrinsic evaluations of the model and the results obtained suggest that it has potential.","PeriodicalId":265848,"journal":{"name":"2009 Seventh Brazilian Symposium in Information and Human Language Technology","volume":"42 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-09-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133611349","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Portuguese Temporal Expressions Recognition: From TE Characterization to an Effective TER Module Implementation
Caroline Hagège, J. Baptista, N. Mamede
{"title":"Portuguese Temporal Expressions Recognition: From TE Characterization to an Effective TER Module Implementation","authors":"Caroline Hagège, J. Baptista, N. Mamede","doi":"10.1109/STIL.2009.12","DOIUrl":"https://doi.org/10.1109/STIL.2009.12","url":null,"abstract":"Taking into account the temporal dimension conveyed in texts is a challenge to natural language processing. At the same time this task is of great importance for a wide range of natural language processing applications. The goal of this paper is twofold. First a characterization of Portuguese temporal expressions as they appear in texts is presented. This classification is intended to meet the requirements of high inter-agreement between annotators of temporal expressions. Second, relying on this characterization, an effective temporal expression annotation tool is described. Results from its evaluation are reported.","PeriodicalId":265848,"journal":{"name":"2009 Seventh Brazilian Symposium in Information and Human Language Technology","volume":"3 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-09-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123639904","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 5
Clause Identification Using Entropy Guided Transformation Learning
Eraldo Rezende Fernandes, B. '. Pires, C. D. Santos, R. Milidiú
{"title":"Clause Identification Using Entropy Guided Transformation Learning","authors":"Eraldo Rezende Fernandes, B. '. Pires, C. D. Santos, R. Milidiú","doi":"10.1109/STIL.2009.10","DOIUrl":"https://doi.org/10.1109/STIL.2009.10","url":null,"abstract":"Entropy Guided Transformation Learning (ETL) is a machine learning strategy that extends Transformation Based Learning by providing automatic template generation. In this work, we propose an ETL approach to the clause identification task. We use the English language corpus of the CoNLL'2001 shared task. The achieved performance is not competitive yet, since the F1 of the ETL based system is 80.55, whereas the state-of-the-art system performance is 85.03. Nevertheless, our modeling strategy is very simple, when compared to the state-of-the-art approaches. These first findings indicate that the ETL approach is a promising one for this task. One can enhance its performance by incorporating problem specific knowledge. Additional features can be easily introduced in the ETL model.","PeriodicalId":265848,"journal":{"name":"2009 Seventh Brazilian Symposium in Information and Human Language Technology","volume":"36 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-09-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122519706","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 7
Fine-Tuning in Portuguese-English Statistical Machine Translation
Wilker Aziz, T. Pardo, Ivandré Paraboni
{"title":"Fine-Tuning in Portuguese-English Statistical Machine Translation","authors":"Wilker Aziz, T. Pardo, Ivandré Paraboni","doi":"10.1109/STIL.2009.16","DOIUrl":"https://doi.org/10.1109/STIL.2009.16","url":null,"abstract":"In previous work we have shown results of a first experiment in Statistical Machine Translation (SMT) for Brazilian Portuguese and American English using state-of-the-art phrase-based models. In this paper we compare a number of training and decoding parameter choices for fine-tuning the system as an attempt to obtain optimal results for this language pair.","PeriodicalId":265848,"journal":{"name":"2009 Seventh Brazilian Symposium in Information and Human Language Technology","volume":"27 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-09-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125882159","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 2
Semantic Relation Extraction by Analysis of Terms Correlation in Documents
Sérgio William Botero, I. Ricarte
{"title":"Semantic Relation Extraction by Analysis of Terms Correlation in Documents","authors":"Sérgio William Botero, I. Ricarte","doi":"10.1109/STIL.2009.18","DOIUrl":"https://doi.org/10.1109/STIL.2009.18","url":null,"abstract":"Ontologies are important to organize and describe information, but are hard to create and maintain, which motivates the development of tools to help in this task. This article presents a strategy to extract, from a corpus of documents in a given domain, semantic elements expressing proximity relations between terms and concepts to help the construction of domain ontologies. The technique presented here, ACT, is based on linguistic processing, machine learning, and biclustering. Results show that concepts obtained by ACT are at least as good as those from similar techniques, such as LSI and NMF. In relation to those techniques, it additionally has the advantage of allowing the supervision by a domain expert.","PeriodicalId":265848,"journal":{"name":"2009 Seventh Brazilian Symposium in Information and Human Language Technology","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-09-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132782411","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
From Factorial to Quadratic Time Complexity for Sentence Realization Using Nearest Neighbour Algorithm
Karthik Gali, Sriram Venkatapathy, Taraka Rama
{"title":"From Factorial to Quadratic Time Complexity for Sentence Realization Using Nearest Neighbour Algorithm","authors":"Karthik Gali, Sriram Venkatapathy, Taraka Rama","doi":"10.1109/STIL.2009.38","DOIUrl":"https://doi.org/10.1109/STIL.2009.38","url":null,"abstract":"Sentence Realization is the task of generating a well-formed sentence from a bag of words. Sentence Realization is a major step in many Natural Language Processing applications like Machine Translation (MT), Summarization and Dialogue Systems. In this paper, we explore a graph based Nearest Neighbour Algorithm for the task of Sentence Realization. The task of Sentence Realization involves formation of a well-formed sentence from a bag of lexical items. These lexical items may be attached syntactically with one another. The level of syntactic information varies from application to application. Our aim is to achieve a quality sentence realiser using as little syntactic information as possible and with minimal computational complexity. As such, our experiments assume only basic syntactic information, such as unlabeled dependency relationships between the lexical items. Graph based algorithms for Natural Language applications such as Parsing (McDonald et al. 2005), Summarization (Mihalcea and Tarau 2005) and Word sense disambiguation (Mihalcea 2005) have been well explored. For the task of Sentence Realization, graph based algorithms have yet to be explored. This paper is a novel effort in that direction.","PeriodicalId":265848,"journal":{"name":"2009 Seventh Brazilian Symposium in Information and Human Language Technology","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-09-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134387608","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1
SAHARA: An Online Service for HAREM Named Entity Recognition Evaluation
Hugo Gonçalo Oliveira, Nuno Cardoso
{"title":"SAHARA: An Online Service for HAREM Named Entity Recognition Evaluation","authors":"Hugo Gonçalo Oliveira, Nuno Cardoso","doi":"10.1109/STIL.2009.31","DOIUrl":"https://doi.org/10.1109/STIL.2009.31","url":null,"abstract":"This paper presents SAHARA, an online service for the evaluation platform of Second HAREM. SAHARA allows a fast evaluation of any NER system that conforms with HAREM guidelines, making it easier to perform post-hoc evaluations and keep track of the overall performance of NER systems.","PeriodicalId":265848,"journal":{"name":"2009 Seventh Brazilian Symposium in Information and Human Language Technology","volume":"21 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-09-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"117094318","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 4
A Testbed for Portuguese Natural Language Generation
E. M. D. Novais, Rafael L. de Oliveira, D. B. Pereira, Thiago Dias Tadeu, Ivandré Paraboni
{"title":"A Testbed for Portuguese Natural Language Generation","authors":"E. M. D. Novais, Rafael L. de Oliveira, D. B. Pereira, Thiago Dias Tadeu, Ivandré Paraboni","doi":"10.1109/STIL.2009.17","DOIUrl":"https://doi.org/10.1109/STIL.2009.17","url":null,"abstract":"We present a data-text aligned corpus for Brazilian Portuguese Natural Language Generation (NLG) called SINotas, which we believe to be the first of its kind. SINotas provides a testbed for research on various aspects of trainable, corpus-based NLG, and it is the basis of a simple NLG application under development in the education domain.","PeriodicalId":265848,"journal":{"name":"2009 Seventh Brazilian Symposium in Information and Human Language Technology","volume":"33 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-09-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125062204","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 4
Evaluating the Performance of a Centroid-Based Probabilistic Neural Network
P. M. Ciarelli, E. Oliveira
{"title":"Evaluating the Performance of a Centroid-Based Probabilistic Neural Network","authors":"P. M. Ciarelli, E. Oliveira","doi":"10.1109/STIL.2009.32","DOIUrl":"https://doi.org/10.1109/STIL.2009.32","url":null,"abstract":"This article proposes a technique that uses centroids together with a Probabilistic Neural Network to minimize some disadvantages of this network, such as the storage space required for the neural network weights and a time complexity that is linear in the number of training samples. In the experiments carried out, memory usage and classification time were drastically reduced. In addition, the quality of the results was considerably improved by using the a priori probability together with these centroids.","PeriodicalId":265848,"journal":{"name":"2009 Seventh Brazilian Symposium in Information and Human Language Technology","volume":"24 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-09-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114922805","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0