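Implementing Vertices Principal Component Analysis under Various Weighting Schemes for Interval Valued Observations with Applications to Data Mining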

Dhaka University Journal of Science. Pub Date: 2024-03-25. DOI: 10.3329/dujs.v72i1.71184

Md Anwarul Islam Bhuiyan, S. Jahan, Mohammad Babul Hasan

Implementing Vertices Principal Component Analysis under Various Weighting Schemes for Interval Valued Observations with Applications to Data Mining

Data mining is the process of extracting valuable information from a larger collection of raw data by searching huge data sets for irregularities, trends, and correlations in order to forecast results. Although a number of techniques have been developed in past years for mining conventional data, there is still considerable scope for work on interval-valued data (IVD). Working with IVD has been shown to be important for identifying an objective entity precisely and for representing incomplete knowledge about real-life situations. Unlike classical data, where each object is represented by a point, in IVD each object is represented by a region in R^p. This paper explores an extension of Principal Component Analysis (PCA) for interval-valued information known as the Vertices Principal Components method. The method additionally incorporates the relative contributions of the vertices, which depend on the choice of weighting scheme. A new approach to classifying supervised IVD, based on the K-Nearest Neighbor (KNN) technique, is proposed. The proposed approach is implemented on several benchmark data sets. Numerical results indicate which weighting scheme is appropriate for each data set and leads to a better recognition rate. Dhaka Univ. J. Sci. 72(1): 46-55, 2024 (January)
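The abstract does not spell out the computation, so the following is only a minimal sketch of the general vertices approach to PCA on interval data, not the authors' implementation. It assumes each observation is a hyperrectangle given by lower and upper bounds in R^p, expands it into its 2^p vertices, runs a weighted PCA on the stacked vertex matrix, and reports each observation's score on a component as the interval spanned by its projected vertices. The function names, the `weights` argument, and the uniform default weighting are illustrative assumptions; the specific weighting schemes compared in the paper are not given in the abstract.

```python
import itertools
import numpy as np


def vertices_matrix(lower, upper):
    """Stack the 2^p vertices of every hyperrectangle row-wise.

    lower, upper: arrays of shape (n, p) with lower <= upper.
    Returns shape (n * 2^p, p); observation i owns rows
    i * 2^p through (i + 1) * 2^p - 1.
    """
    lower = np.asarray(lower, dtype=float)
    upper = np.asarray(upper, dtype=float)
    n, p = lower.shape
    # Each 0/1 pattern selects the lower (0) or upper (1) bound per variable.
    patterns = np.array(list(itertools.product([0, 1], repeat=p)))
    verts = np.where(patterns[None, :, :] == 1,
                     upper[:, None, :], lower[:, None, :])
    return verts.reshape(n * 2 ** p, p)


def weighted_vertices_pca(lower, upper, weights=None, n_components=2):
    """Weighted PCA on the vertex matrix (hypothetical helper).

    weights: one non-negative weight per vertex row; uniform if None
             (an assumption, standing in for the paper's schemes).
    Returns (score_lower, score_upper, eigenvalues), where the score
    arrays have shape (n, n_components).
    """
    lower = np.asarray(lower, dtype=float)
    n, p = lower.shape
    V = vertices_matrix(lower, upper)
    w = np.ones(len(V)) if weights is None else np.asarray(weights, dtype=float)
    w = w / w.sum()                          # normalise weights to sum to 1
    mean = w @ V                             # weighted centroid of all vertices
    Vc = V - mean
    cov = (Vc * w[:, None]).T @ Vc           # weighted covariance matrix (p x p)
    eigvals, eigvecs = np.linalg.eigh(cov)   # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:n_components]
    scores = (Vc @ eigvecs[:, order]).reshape(n, 2 ** p, len(order))
    # Interval score = range of an observation's projected vertices.
    return scores.min(axis=1), scores.max(axis=1), eigvals[order]


if __name__ == "__main__":
    lo = [[0.0, 0.0], [2.0, 1.0], [4.0, 3.0]]
    hi = [[1.0, 1.0], [3.0, 2.0], [5.0, 5.0]]
    score_lo, score_hi, ev = weighted_vertices_pca(lo, hi)
    print(score_lo, score_hi, ev)
```

Because the vertex matrix has n * 2^p rows, this direct construction is only practical for a small number of variables p. A classifier in the spirit of KNN could then operate on the resulting interval scores, but the distance it would use is not specified in the abstract and is left out here.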
