{"title":"Ciphertexts Clustering is Equivalent to Plaintexts Clustering","authors":"W. A. R. Souza, L. A. V. Carvalho, J. A. Xexéo","doi":"10.1109/STIL.2009.21","DOIUrl":"https://doi.org/10.1109/STIL.2009.21","url":null,"abstract":"Several studies have been made in attempt to break confidentiality, either by obtaining the knowledge of the plaintext or the key itself working only with cryptograms. However, there is not known methods capable of breaking contemporary cryptographic algorithms, as DES and AES. Nevertheless, in order to benefit cryptanalysts, it is possible to search weakness in these algorithms. In this work we show that ciphertexts can be considered as plaintexts written in an unknown idiom and using a binary alphabet, where each idiom is determined by the cryptographic key. In the experiments with ciphertexts and plaintexts clustering it have reached success, since all ciphertexts encrypted with the same key belong to the same group, as well as, plaintexts, written in the same idiom and alphabet belong to the same group. This result exposes a cryptographic algorithms weakness, since they are designed to generate ciphertexts without any relation with the input data, such as the plaintext or the cryptographic key.","PeriodicalId":265848,"journal":{"name":"2009 Seventh Brazilian Symposium in Information and Human Language Technology","volume":"23 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-09-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132345112","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Towards the Automatic Learning of Ontologies","authors":"Isidra Ocampo-Guzman, I. Lopez-Arevalo, E. Tello-Leal, V. Sosa-Sosa","doi":"10.1109/STIL.2009.23","DOIUrl":"https://doi.org/10.1109/STIL.2009.23","url":null,"abstract":"This paper proposes a methodology for the automatic learning of ontologies from a text corpus. The concepts (topics) from documents into the corpus are identified by using the Latent Dirichlet Allocation model. Based on the set of identified topics, for each concept it is constructed its taxonomy by using the terms with greater probability which contribute to define it. WordNet is used in the construction of these partial topic taxonomies by obtaining the similarity and relatedness between the terms that constitute each topic. The resulting taxonomies are joined to structure the final ontology. The methodology is evaluated with the Lonely Planet corpus.","PeriodicalId":265848,"journal":{"name":"2009 Seventh Brazilian Symposium in Information and Human Language Technology","volume":"99 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-09-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131684999","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Legal Information Retrieval: Evaluating Case-Based Reasoning","authors":"Symball Rufino de Oliveira, Marisa Bräscher Basílio Medeiros","doi":"10.1109/STIL.2009.35","DOIUrl":"https://doi.org/10.1109/STIL.2009.35","url":null,"abstract":"This is a research whose object of study is to evaluate a legal Information Retrieval system precision. This information retrieval system is based on a model that uses artificial intelligence technique known as Case-Based Reasoning (CBR). The principle of CBR is that a past legal case can be useful to solve a current problem, since there is between them some degree of similarity. This research uses jurisprudences produced by the Regional Electoral Tribunal of the Distrito Federal. The precision degree was evaluated from the result of a set of queries submitted to it. The method adopted for the evaluation was the same used in the Text REtrieval Conference in 2007 by Legal Track Task.","PeriodicalId":265848,"journal":{"name":"2009 Seventh Brazilian Symposium in Information and Human Language Technology","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-09-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130353966","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Lavinia: An Environment for Natural Language Processing","authors":"Cecilia Techera, Diego Garat, Guillermo Moncecchi","doi":"10.1109/STIL.2009.9","DOIUrl":"https://doi.org/10.1109/STIL.2009.9","url":null,"abstract":"In this article we present Lavinia, an environment for Natural Language Processing (NLP) in which developers and users can integrate and share components built on the UIMA platform. Lavinia introduces an algorithm for visualizing results independently of the process that generated them, and also lets the user dynamically modify how they are displayed, in order to highlight the aspects of the analysis that are of interest to them.","PeriodicalId":265848,"journal":{"name":"2009 Seventh Brazilian Symposium in Information and Human Language Technology","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-09-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115415325","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Construction of a Domain Ontological Structure from Wikipedia","authors":"C. Xavier, Vera Lúcia Strube de Lima","doi":"10.1109/STIL.2009.26","DOIUrl":"https://doi.org/10.1109/STIL.2009.26","url":null,"abstract":"Data extraction from Wikipedia for ontologies construction, enrichment and population is an emerging research field. This paper describes a study on automatic extraction of an ontological structure containing hyponymy and location relations from Wikipedia's Tourism category in Portuguese, illustrated with an experiment, and evaluation of its results.","PeriodicalId":265848,"journal":{"name":"2009 Seventh Brazilian Symposium in Information and Human Language Technology","volume":"11 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-09-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132812328","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Identification of Multiword Expressions in Technical Domains: Investigating Statistical and Alignment-Based Approaches","authors":"Aline Villavicencio, Helena de Medeiros Caseli, A. Machado","doi":"10.1109/STIL.2009.33","DOIUrl":"https://doi.org/10.1109/STIL.2009.33","url":null,"abstract":"Multiword Expressions (MWEs) are one of the stumbling blocks for more precise Natural Language Processing (NLP) systems. The lack of coverage of MWEs in resources can impact negatively on the performance of tasks and applications, and can lead to loss of information or communication errors; especially in technical domains where MWE are frequent. This paper investigates some approaches to the identification of MWEs in technical corpora based on: association measures, part-of-speech and lexical alignment information. We examine the influence of some factors on their performance such as sources of information for identification and evaluation. While the association measures emphasize recall, the alignment method focuses on precision.","PeriodicalId":265848,"journal":{"name":"2009 Seventh Brazilian Symposium in Information and Human Language Technology","volume":"23 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-09-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121052037","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Technologies for the Development of Spoken Dialog Systems in Brazilian Portuguese","authors":"J. Morais, Nelson Neto, Aldebaro Barreto da Rocha Klautau Jr","doi":"10.1109/STIL.2009.20","DOIUrl":"https://doi.org/10.1109/STIL.2009.20","url":null,"abstract":"This work discusses the integration of available technologies for developing spoken dialog systems in Brazilian Portuguese. As a proof-of-concept, it describes a system for non-visual and on-line Web search on Windows. The prototype system is based on Microsoft's Speech Application Programming Interface (SAPI), which provides an interface that allows the establishment of a dialog, where the system asks the site and query word. The system then reads aloud the page contents. The system itself coordinates the interaction with the user and is currently limited to query by the name of countries.","PeriodicalId":265848,"journal":{"name":"2009 Seventh Brazilian Symposium in Information and Human Language Technology","volume":"51 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-09-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126662416","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Computational Linguistic Systematization of a Lexicon from the “Ibitinga embroidery industry” Conceptual Domain","authors":"E. Marcellino, Bento Carlos Dias da Silva","doi":"10.1109/STIL.2009.34","DOIUrl":"https://doi.org/10.1109/STIL.2009.34","url":null,"abstract":"This work discusses a proposition for organizing the lexical items from the conceptual domain labeled THE EMBROIDERY INDUSTRY OF IBITINGA in terms of a natural ontology. It also aims to establish the alignment between this ontology and the bases WordNet.Pr and WordNet.Br.","PeriodicalId":265848,"journal":{"name":"2009 Seventh Brazilian Symposium in Information and Human Language Technology","volume":"73 1‐2","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-09-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"113953551","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Text Readability Analysis with Natural Language Processing Tools: The Adaptation of Coh-Metrix Metrics for Portuguese","authors":"D. Almeida, S. Aluísio, S. M. Aluísio","doi":"10.1109/STIL.2009.13","DOIUrl":"https://doi.org/10.1109/STIL.2009.13","url":null,"abstract":"This paper presents the adaptation of Coh-Metrix metrics for the Brazilian Portuguese language (Coh-Metrix-Port). It describes the analysis of natural language processing tools for Portuguese, the decisions taken for the creation of Coh-Metrix-Port, and a case study of the application of Coh-Metrix-Port in the analysis of original and simple accounts, i.e. texts composed in a way that the writer recasts the information from a source to suit a particular kind of reader, for kids. This tool can help assessing whether text available on the Web are suitable for functional illiterates and people with other cognitive disabilities, such as, dyslexia and aphasia, and also for children and adults learning to read and thus allowing the access of Web texts for a wider range of users.","PeriodicalId":265848,"journal":{"name":"2009 Seventh Brazilian Symposium in Information and Human Language Technology","volume":"241 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-09-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132675992","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Semantic Inferentialist Analyser: A Semantic Analyser for Natural Language Sentences","authors":"V. Pinheiro, T. Pequeno, Vasco Furtado, Douglas Nogueira","doi":"10.1109/STIL.2009.14","DOIUrl":"https://doi.org/10.1109/STIL.2009.14","url":null,"abstract":"This paper describes the Semantic Inferentialist Analyzer (SIA), which implements an algorithm for semantic analysis of sentences that reasons on the inferential content of concepts and sentences patterns. The inferential semantic relatedness measure and the reasoning process of the SIA are described. Finally, an application of the SIA in an information extraction system about crimes - WikiCrimesIE - and the experimental results are discussed.","PeriodicalId":265848,"journal":{"name":"2009 Seventh Brazilian Symposium in Information and Human Language Technology","volume":"6 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2009-09-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130787520","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}