Importance of Initialization in K-Means Clustering
Anubhav Gupta, Antriksh Tomer, S. Dahiya
2022 Second International Conference on Advances in Electrical, Computing, Communication and Sustainable Technologies (ICAECT), 2022-04-21
DOI: https://doi.org/10.1109/ICAECT54875.2022.9807996
Data clustering is a method of organizing and visualizing data so that the researcher can see similar patterns in it. These patterns lead to conclusions that help interpret the data and can support further research. This paper focuses on the initialization technique used in K-Means clustering and shows how improper initialization of the centroids can lead to poor or unfruitful results. In addition, the complexity of the overall algorithm depends on the type of initialization used. The study therefore compares various initialization techniques and the research behind them, giving the researcher insight into the available techniques so that a suitable one can be chosen. It also considers the nature of the data and examines how different types of datasets are affected by the choice of initialization. Finally, it analyzes the impact of repeating the K-Means clustering algorithm on the results.
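The effect of initialization and of repeated runs can be illustrated with a minimal sketch. The abstract does not name the techniques the paper compares, so the two seeding strategies below (uniform random selection and k-means++ style seeding) are assumptions chosen for illustration, as are the synthetic dataset and all function names. The sketch uses only the Python standard library and reports the sum of squared errors (SSE) as the quality measure.

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def init_random(points, k, rng):
    """Pick k distinct data points uniformly at random as initial centroids."""
    return rng.sample(points, k)

def init_kmeanspp(points, k, rng):
    """k-means++ style seeding: each new centroid is drawn with probability
    proportional to its squared distance from the nearest chosen centroid.
    Assumes the data holds at least k distinct points."""
    centers = [rng.choice(points)]
    while len(centers) < k:
        d2 = [min(sq_dist(p, c) for c in centers) for p in points]
        centers.append(rng.choices(points, weights=d2, k=1)[0])
    return centers

def kmeans(points, k, init, rng, max_iter=100):
    """Lloyd's algorithm; returns the final centroids and the SSE."""
    centers = [tuple(c) for c in init(points, k, rng)]
    for _ in range(max_iter):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            j = min(range(k), key=lambda i: sq_dist(p, centers[i]))
            clusters[j].append(p)
        # Update step: move each centroid to the mean of its cluster.
        # An empty cluster keeps its previous centroid.
        new_centers = [
            tuple(sum(col) / len(c) for col in zip(*c)) if c else centers[i]
            for i, c in enumerate(clusters)
        ]
        if new_centers == centers:
            break
        centers = new_centers
    sse = sum(min(sq_dist(p, c) for c in centers) for p in points)
    return centers, sse

def best_of(points, k, init, n_runs, seed=0):
    """Repeat K-Means n_runs times and keep the lowest SSE."""
    rng = random.Random(seed)
    return min(kmeans(points, k, init, rng)[1] for _ in range(n_runs))

def make_blobs(seed=0, per_cluster=50):
    """Hypothetical test data: four Gaussian blobs at the corners of a square."""
    rng = random.Random(seed)
    means = [(0, 0), (10, 0), (0, 10), (10, 10)]
    return [(rng.gauss(mx, 1), rng.gauss(my, 1))
            for mx, my in means for _ in range(per_cluster)]

if __name__ == "__main__":
    pts = make_blobs()
    for name, init in [("random", init_random), ("k-means++", init_kmeanspp)]:
        for n_runs in (1, 10):
            print(name, n_runs, round(best_of(pts, 4, init, n_runs), 2))
```

Running the script prints the best SSE for each seeding strategy after one run and after ten runs. Because `best_of` keeps the minimum over all runs from the same seed, more repetitions can only match or lower the reported SSE; how much they help, and how the two strategies compare, depends on the dataset.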