Effective Heuristic Techniques for Combined Robust Clustering Problem

Yunhe Xu, Chenchen Wu, Ling Gai, Lu Han
Journal: Asia Pac. J. Oper. Res.
Publication date: 2022-02-07
Publication type: Journal Article
DOI: 10.1142/s0217595922400097 (https://doi.org/10.1142/s0217595922400097)
Citations: 0

Abstract

Clustering is one of the most important problems in data mining, machine learning, biological population division, and related fields. Robust variants of the $k$-means problem, which include $k$-means with penalties and $k$-means with outliers, are also an active research branch. Most of these problems are NP-hard, including the most classical one, the $k$-means problem. For NP-hard problems, heuristic algorithms are a powerful method. When the quality of the output can be guaranteed, the algorithm is called an approximation algorithm. In this paper, combining the two types of robust settings, we consider the $k$-means problem with penalties and outliers ($k$-MPO). In the $k$-MPO, we are given an [Formula: see text]-point set [Formula: see text], a penalty cost [Formula: see text] for each [Formula: see text], an integer [Formula: see text], and an integer [Formula: see text]. The goal is to find a center subset [Formula: see text] with [Formula: see text], a penalty subset [Formula: see text], and an outlier subset [Formula: see text] with [Formula: see text], such that the total cost, consisting of the connection cost and the penalty cost, is minimized. We offer an approximation algorithm that uses a heuristic local search scheme. Based on a single-swap operation, we obtain a [Formula: see text]-approximation algorithm.
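To make the problem statement and the single-swap idea concrete, here is a minimal Python sketch. It is an illustration only, not the authors' algorithm, and it carries no approximation guarantee. Everything in it is an assumption: the names `mpo_cost` and `single_swap_local_search`, the parameter `z` for the outlier bound, squared Euclidean distance as the connection cost, candidate centers restricted to the input points, distinct input points, and the simple "accept any strict improvement" rule.

```python
from itertools import product


def sq_dist(a, b):
    """Squared Euclidean distance (assumed connection cost)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mpo_cost(points, penalties, centers, z):
    """Cost of a center set under the assumed k-MPO objective.

    Each point pays the cheaper of its connection cost and its penalty;
    the z most expensive points are then discarded for free as outliers.
    """
    per_point = sorted(
        min(min(sq_dist(x, c) for c in centers), p)
        for x, p in zip(points, penalties)
    )
    keep = max(len(per_point) - z, 0)
    return sum(per_point[:keep])


def single_swap_local_search(points, penalties, k, z, tol=1e-12):
    """Hypothetical single-swap local search over the input points.

    Starts from the first k points and repeatedly replaces one center
    by one non-center point while the cost strictly decreases.
    """
    centers = list(points[:k])
    best = mpo_cost(points, penalties, centers, z)
    improved = True
    while improved:
        improved = False
        for i, cand in product(range(k), points):
            if cand in centers:
                continue
            trial = centers[:i] + [cand] + centers[i + 1:]
            cost = mpo_cost(points, penalties, trial, z)
            if cost < best - tol:
                centers, best = trial, cost
                improved = True
                break  # restart the scan from the new solution
    return centers, best


if __name__ == "__main__":
    # Two tight clusters, one far point, and one cheap-to-penalize point.
    pts = [(0, 0), (0, 1), (10, 0), (10, 1), (100, 100), (50, 50)]
    pen = [1000, 1000, 1000, 1000, 1000, 3]
    print(single_swap_local_search(pts, pen, k=2, z=1))
```

For a fixed center set the sketch chooses the penalty and outlier subsets greedily: a point is penalized when its penalty is below its connection cost, and the `z` points with the largest remaining cost become outliers. The search stops at a local optimum, where no single swap lowers the cost.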