Multimodal Web Page Segmentation Using Self-organized Multi-objective Clustering

ACM Transactions on Information Systems (TOIS) Pub Date : 2022-03-07 DOI:10.1145/3480966

Srivatsa Ramesh Jayashree, G. Dias, J. Andrew, S. Saha, Fabrice Maurel, S. Ferrari


Citations: 2

Abstract

Web page segmentation (WPS) aims to break a web page into different segments with coherent intra- and inter-semantics. By revealing the morpho-dispositional semantics of a web page, WPS has traditionally been used to separate informative from non-informative content, but it has also proved to play a key role in non-linear access to web information for visually impaired people. For that purpose, many ad hoc solutions have been proposed that rely on visual, logical, and/or text cues. However, such methodologies depend heavily on manually tuned heuristics and are parameter-dependent. To overcome these drawbacks, principled frameworks have been proposed that provide the theoretical bases to achieve optimal solutions. However, existing methodologies combine only a few discriminant features and do not define strategies to automatically select the optimal number of segments. In this article, we present a multi-objective clustering technique called MCS that relies on \( K \)-means, in which (1) visual, logical, and text cues are all combined in an early fusion manner and (2) an evolutionary process automatically discovers the optimal number of clusters (segments) as well as the correct positioning of seeds. As such, our proposal is parameter-free, combines many different modalities, does not depend on manually tuned heuristics, and can be run on any web page without any constraint. An exhaustive evaluation over two different tasks, where (1) the number of segments must be discovered or (2) the number of clusters is fixed with respect to the task at hand, shows that MCS drastically improves over most competitive and up-to-date algorithms for a wide variety of external and internal validation indices. In particular, the results clearly show the impact of the visual and logical modalities on segmentation performance.
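The two ingredients named in the abstract, early fusion of modalities and automatic selection of the number of clusters, can be illustrated with a toy sketch. This is not the MCS algorithm: the evolutionary multi-objective search is replaced here by a plain scan over candidate values of K scored with the silhouette index, and seeds are placed by a deterministic farthest-first rule. The feature choices (block position, DOM depth, word count), the data values, and all function names are hypothetical and chosen only for illustration.

```python
import math

def early_fusion(visual, logical, text):
    """Concatenate the per-block modality vectors, then z-score each column."""
    fused = [v + l + t for v, l, t in zip(visual, logical, text)]
    n = len(fused)
    cols = list(zip(*fused))
    means = [sum(c) / n for c in cols]
    stds = [math.sqrt(sum((x - m) ** 2 for x in c) / n) or 1.0
            for c, m in zip(cols, means)]
    return [[(x - m) / s for x, m, s in zip(row, means, stds)] for row in fused]

def dist(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def farthest_first_seeds(points, k):
    """Assumed seeding rule (not from the paper): start at block 0, then
    repeatedly add the block farthest from all seeds chosen so far."""
    seeds = [points[0]]
    while len(seeds) < k:
        seeds.append(max(points, key=lambda p: min(dist(p, s) for s in seeds)))
    return seeds

def kmeans(points, k, iters=50):
    centroids = farthest_first_seeds(points, k)
    labels = []
    for _ in range(iters):
        labels = [min(range(k), key=lambda c: dist(p, centroids[c])) for p in points]
        updated = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            # Keep the old centroid if a cluster ends up empty.
            updated.append([sum(col) / len(members) for col in zip(*members)]
                           if members else centroids[c])
        if updated == centroids:
            break
        centroids = updated
    return labels

def silhouette(points, labels):
    """Mean silhouette width, an internal validation index in [-1, 1]."""
    clusters = set(labels)
    if len(clusters) < 2:
        return -1.0
    total = 0.0
    for i, p in enumerate(points):
        own = [dist(p, q) for j, q in enumerate(points)
               if labels[j] == labels[i] and j != i]
        if not own:
            continue  # a singleton cluster contributes 0
        a = sum(own) / len(own)
        b = min(sum(dist(p, q) for j, q in enumerate(points) if labels[j] == c)
                / labels.count(c)
                for c in clusters if c != labels[i])
        total += (b - a) / (max(a, b) or 1.0)
    return total / len(points)

def choose_segmentation(points, k_max=6):
    """Stand-in for the evolutionary search: try each K, keep the best score."""
    best_score, best_labels = -2.0, None
    for k in range(2, min(k_max, len(points) - 1) + 1):
        labels = kmeans(points, k)
        score = silhouette(points, labels)
        if score > best_score:
            best_score, best_labels = score, labels
    return best_labels

# Hypothetical page with nine blocks: visual = (x, y), logical = DOM depth,
# text = word count. The values are made up for the example.
visual = [[0.0, 0.0], [0.1, 0.0], [0.0, 0.1],
          [5.0, 5.0], [5.1, 5.0], [5.0, 5.1],
          [10.0, 0.0], [10.1, 0.0], [10.0, 0.1]]
logical = [[1.0], [1.0], [1.0], [4.0], [4.0], [4.0], [7.0], [7.0], [7.0]]
text = [[2.0], [2.0], [3.0], [40.0], [42.0], [41.0], [100.0], [101.0], [99.0]]

labels = choose_segmentation(early_fusion(visual, logical, text))
```

On this toy input the scan settles on three segments, one per group of blocks. A single internal index is used here for brevity; a multi-objective method would optimize several criteria at once rather than one score.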




Source journal

ACM Transactions on Information Systems (TOIS)

Self-citation rate

0.00%

Publication volume