Towards Intelligent Web Context-Based Content On-Demand Extraction Using Deep Learning

Mina A. Melek, B. Mokhtar
Published in: 2020 IEEE Global Conference on Artificial Intelligence and Internet of Things (GCAIoT), 2020-12-12
DOI: https://doi.org/10.1109/GCAIoT51063.2020.9345816

Abstract

Extracting information from, and reasoning over, massive high-dimensional data in dynamic contexts is highly demanding and hard to achieve in real time. The capability and efficiency of such processing can be limited by the available computational resources and the resulting power consumption. Conventional search mechanisms often cannot fetch predefined content from a data source in real time, even before accounting for the growing number of connected devices that contribute to the same source. In this work, we propose a concept for an efficient approach to online content searching that takes advantage of a) the structure of the data profiling employed at the related data source; and b) learning algorithms that extract its common features and generate a map of indices to data contents. This enables instant mapping of users' requests, making the process as close to real time as possible. The main blocks of the adopted learning algorithms are built to capture the semantic features in the targeted context of data sentences. We reviewed several learning approaches and compared their preliminary results according to how well they captured these semantic features. The preliminary results confirmed that weighted recurrent neural networks and the GloVe pre-trained model, paired with NMF topic modeling, yielded highly acceptable levels of F1-score and prediction time.
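
To make the idea of "a map of indices to data contents" more concrete, the following is a minimal sketch of the NMF topic modeling step alone, written in plain Python with Lee-Seung multiplicative updates. It is not the authors' pipeline, which also involves GloVe embeddings and weighted recurrent networks; the toy corpus, the function names (`term_doc_matrix`, `nmf`, `build_topic_index`, `lookup`) and all parameter values are assumptions made for illustration only.

```python
import random
from collections import defaultdict

def term_doc_matrix(docs):
    """Return (vocab, V) where V[i][j] counts term j in document i."""
    vocab = sorted({w for d in docs for w in d.lower().split()})
    col = {w: j for j, w in enumerate(vocab)}
    V = [[0.0] * len(vocab) for _ in docs]
    for i, d in enumerate(docs):
        for w in d.lower().split():
            V[i][col[w]] += 1.0
    return vocab, V

def matmul(A, B):
    cols_B = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, c)) for c in cols_B] for row in A]

def transpose(A):
    return [list(r) for r in zip(*A)]

def nmf(V, k, iters=200, seed=0, eps=1e-9):
    """Factor V into non-negative W (docs x topics) and H (topics x terms)."""
    rng = random.Random(seed)
    n, m = len(V), len(V[0])
    W = [[rng.random() + 0.1 for _ in range(k)] for _ in range(n)]
    H = [[rng.random() + 0.1 for _ in range(m)] for _ in range(k)]
    for _ in range(iters):
        # Multiplicative update for H, then for W (Frobenius-norm objective).
        Wt = transpose(W)
        num, den = matmul(Wt, V), matmul(matmul(Wt, W), H)
        H = [[H[a][j] * num[a][j] / (den[a][j] + eps) for j in range(m)]
             for a in range(k)]
        Ht = transpose(H)
        num, den = matmul(V, Ht), matmul(W, matmul(H, Ht))
        W = [[W[i][a] * num[i][a] / (den[i][a] + eps) for a in range(k)]
             for i in range(n)]
    return W, H

def build_topic_index(W):
    """Map each topic id to the documents whose strongest topic it is."""
    index = defaultdict(list)
    for doc_id, weights in enumerate(W):
        topic = max(range(len(weights)), key=weights.__getitem__)
        index[topic].append(doc_id)
    return dict(index)

def lookup(query, vocab, H, index):
    """Score the query's known terms against each topic; return that topic's docs."""
    col = {w: j for j, w in enumerate(vocab)}
    scores = [sum(row[col[w]] for w in query.lower().split() if w in col)
              for row in H]
    topic = max(range(len(scores)), key=scores.__getitem__)
    return index.get(topic, [])

# Hypothetical toy corpus standing in for profiled data at the source.
docs = [
    "sensor temperature reading sensor",
    "temperature sensor reading humidity",
    "football match goal score",
    "goal score football league",
]
vocab, V = term_doc_matrix(docs)
W, H = nmf(V, k=2)
index = build_topic_index(W)
hits = lookup("temperature sensor", vocab, H, index)
```

Under these assumptions the index is built once, offline, so a request at query time only needs a handful of lookups in `H` and one dictionary access. That is the kind of precomputed mapping the abstract describes, though the actual features and models in the paper may differ.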