J. M. Sousa, Roney L. S. Santos, L. A. Lopes, V. Machado, Ivan Saraiva Silva
2016 35th International Conference of the Chilean Computer Science Society (SCCC), published 2016-10-01. DOI: 10.1109/SCCC.2016.7836060 (https://doi.org/10.1109/SCCC.2016.7836060)
Automatic labelling of clusters with discrete and continuous data using supervised machine learning
Clustering is regarded as one of the most relevant problems in unsupervised learning research. However, understanding and defining the resulting clusters is not a trivial task, which makes it necessary to identify them, i.e., to assign a label to each cluster. To address this cluster labelling problem, this paper presents a methodology that combines supervised learning, unsupervised learning, and a discretization model, with the aim of increasing the speed and accuracy of the algorithm. An unsupervised learning algorithm is applied to the clustering problem, and a supervised learning algorithm is responsible for detecting the attributes that are meaningful for defining each cluster formed. Together, these strategies form a methodology that presents a label (based on attributes and values) for each provided cluster. The methodology is applied to one database, where it achieved acceptable results, with an average exceeding 92.89% of correctly labelled elements.
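The idea of a label "based on attributes and values" can be illustrated with a minimal sketch. It is not the authors' method: the abstract does not name the clustering algorithm, the supervised learner, or the discretization model. The sketch therefore assumes the cluster assignments are already given, uses equal-width binning as the discretization step, and replaces the supervised attribute-selection step with a plain frequency count. The function names, the `n_bins` parameter, and the toy data are all hypothetical.

```python
from collections import Counter


def equal_width_bins(values, n_bins):
    """Discretize values into n_bins equal-width intervals.

    Returns the bin index of each value, the lower bound, and the bin width.
    """
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins or 1.0  # fall back to 1.0 for constant attributes
    bins = [min(int((v - lo) / width), n_bins - 1) for v in values]
    return bins, lo, width


def label_clusters(rows, clusters, n_bins=3):
    """Return {cluster: (attribute_index, (low, high), coverage)}.

    Assumption: the "meaningful" attribute of a cluster is the one whose most
    frequent interval covers the largest share of the cluster's members.
    This stands in for the supervised learner mentioned in the abstract.
    """
    n_attrs = len(rows[0])
    binned = [equal_width_bins([r[j] for r in rows], n_bins) for j in range(n_attrs)]
    labels = {}
    for c in sorted(set(clusters)):
        members = [i for i, k in enumerate(clusters) if k == c]
        best = None
        for j, (bins, lo, width) in enumerate(binned):
            bin_id, count = Counter(bins[i] for i in members).most_common(1)[0]
            coverage = count / len(members)
            if best is None or coverage > best[2]:
                interval = (lo + bin_id * width, lo + (bin_id + 1) * width)
                best = (j, interval, coverage)
        labels[c] = best
    return labels


# Hypothetical toy data: two continuous attributes, two given clusters.
rows = [(1.0, 10.0), (1.2, 50.0), (0.9, 90.0), (5.0, 20.0), (5.1, 60.0), (4.8, 95.0)]
clusters = [0, 0, 0, 1, 1, 1]
labels = label_clusters(rows, clusters)
for cluster, (attr, (low, high), coverage) in labels.items():
    print(f"cluster {cluster}: attribute {attr} in [{low:.2f}, {high:.2f}], coverage {coverage:.0%}")
```

Here the coverage value plays the role of the share of correctly labelled elements in a cluster. A frequency count alone does not check that an interval distinguishes one cluster from the others, which is one reason a supervised learner is a natural fit for the attribute-selection step.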