Semantic Processing for the Conversion of Unstructured Documents into Structured Information in the Enterprise Context

Adam Bartusiak, Jörg Lässig
Proceedings of the 12th International Conference on Semantic Systems, published 2016-09-12
DOI: 10.1145/2993318.2993341 (https://doi.org/10.1145/2993318.2993341)
Citations: 0

Abstract

We present an ongoing research project addressing the massive amounts of unstructured data generated daily in most business organisations, regardless of size. Our motivation is to help small and medium-sized enterprises in particular gain a competitive advantage in the market. The goal is to improve their processes for extracting valuable business information from such disorganised data. To achieve this, we introduce a flexible and scalable data analysis framework capable of transforming various types of documents into semantically annotated structures, such as emails, text files in various formats, slide presentations, and blog entries. Additionally, the solution provides a semantic search engine for structured retrieval of the analyzed information and a graphical layer that dynamically visualizes the search results as an interactive graph. Throughout the paper, we describe the architecture of the two main engines, which are responsible for data and text analysis and for semantic search. We conclude that semantic processing of unstructured sources significantly improves data management and data integration within enterprises.