Cataloga: A Software for Semantic-Based Terminological Data Mining

A. Elia, Mario Monteleone, Alberto Postiglione
{"title":"Cataloga: A Software for Semantic-Based Terminological Data Mining","authors":"A. Elia, Mario Monteleone, Alberto Postiglione","doi":"10.1109/CCP.2011.42","DOIUrl":null,"url":null,"abstract":"This paper is focused on Catalog a, a software package based on Lexicon-Grammar theoretical and practical analytical framework and embedding a ling ware module built on compressed terminological electronic dictionaries. We will here show how Catalog a can be used to achieve efficient data mining and information retrieval by means of lexical ontology associated to terminology-based automatic textual analysis. Also, we will show how accurate data compression is necessary to build efficient textual analysis software. Therefore, we will here discuss the creation and functioning of a software for semantic-based terminological data mining, in which a crucial role is played by Italian simple and compound-word electronic dictionaries. Lexicon-Grammar is one of the most profitable and consistent methods for natural language formalization and automatic textual analysis it was set up by French linguist Maurice Gross during the '60s, and subsequently developed for and applied to Italian by Annibale Elia, Emilio D'Agostino and Maurizio Martin Elli. Basically, Lexicon-Grammar establishes morph syntactic and statistical sets of analytic rules to read and parse large textual corpora. The analytical procedure here described will prove itself appropriate for any type of digitalized text, and will represent a relevant support for the building and implementing of Semantic Web (SW) interactive platforms.","PeriodicalId":167131,"journal":{"name":"2011 First International Conference on Data Compression, Communications and Processing","volume":"97 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2011-06-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"2","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2011 First International Conference on Data Compression, Communications and Processing","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/CCP.2011.42","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 2

Abstract

This paper is focused on Catalog a, a software package based on Lexicon-Grammar theoretical and practical analytical framework and embedding a ling ware module built on compressed terminological electronic dictionaries. We will here show how Catalog a can be used to achieve efficient data mining and information retrieval by means of lexical ontology associated to terminology-based automatic textual analysis. Also, we will show how accurate data compression is necessary to build efficient textual analysis software. Therefore, we will here discuss the creation and functioning of a software for semantic-based terminological data mining, in which a crucial role is played by Italian simple and compound-word electronic dictionaries. Lexicon-Grammar is one of the most profitable and consistent methods for natural language formalization and automatic textual analysis it was set up by French linguist Maurice Gross during the '60s, and subsequently developed for and applied to Italian by Annibale Elia, Emilio D'Agostino and Maurizio Martin Elli. Basically, Lexicon-Grammar establishes morph syntactic and statistical sets of analytic rules to read and parse large textual corpora. The analytical procedure here described will prove itself appropriate for any type of digitalized text, and will represent a relevant support for the building and implementing of Semantic Web (SW) interactive platforms.
Cataloga:基于语义的术语数据挖掘软件
目录a是一个基于词典语法理论和实践分析框架,嵌入一个基于压缩术语电子词典的语言模块的软件包。我们将在这里展示如何使用Catalog a通过与基于术语的自动文本分析相关联的词汇本体来实现有效的数据挖掘和信息检索。此外,我们将展示准确的数据压缩对于构建高效的文本分析软件是多么必要。因此,我们将在这里讨论基于语义的术语数据挖掘软件的创建和功能,其中意大利语简单词和复合词电子词典起着至关重要的作用。Lexicon-Grammar是自然语言形式化和自动文本分析中最有效和最一致的方法之一,它是由法国语言学家Maurice Gross在60年代建立的,随后由Annibale Elia, Emilio D' agostino和Maurizio Martin Elli发展并应用于意大利语。基本上,Lexicon-Grammar建立了分析规则的词形、句法和统计集,以阅读和解析大型文本语料库。这里描述的分析过程将证明自己适用于任何类型的数字化文本,并将代表对语义网(SW)交互平台的构建和实现的相关支持。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 求助全文
来源期刊
自引率
0.00%
发文量
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信