Text Extraction and Clustering for Multimedia: A review on Techniques and Challenges

2019 International Conference on Digitization (ICD). Pub Date: 2019-11-01. DOI: 10.1109/ICD47981.2019.9105905

Zaheeruddin Ahmed, Harvir Singh

Citations: 1

Abstract

Internet technologies have developed rapidly in recent years, producing massive sets of multimedia data in which text, images, audio, and video deliver a huge volume of content. The text embedded in this multimedia originates from the web and social media and carries complex and meaningful data. There is an increasing need to recognize and extract text from multimedia data, which is unstructured. Many new techniques have been applied to text extraction from multimedia, but not all have been efficient. One important application of text analysis is to extract text information and then derive meaningful data visualizations that support better decisions. This paper focuses on significant text extraction and clustering structures, techniques, and challenges for multimedia data sets. We highlight different approaches to text extraction and clustering from multimedia content.


