cSmartML-Glassbox: Increasing Transparency and Controllability in Automated Clustering
Radwa El Shawi, S. Sakr
2022 IEEE International Conference on Data Mining Workshops (ICDMW), November 2022
DOI: 10.1109/ICDMW58026.2022.00015
Abstract
Machine learning algorithms have been widely employed across applications and fields. Novel technologies in automated machine learning (AutoML) ease the complexity of algorithm selection and hyperparameter optimization. AutoML frameworks have achieved notable success in hyperparameter tuning and have surpassed the performance of human experts. However, relying on such frameworks as black boxes leaves machine learning practitioners without insight into the inner workings of the AutoML process and hence can undermine their trust in the models produced. In addition, excluding humans from the loop creates several limitations. For example, most current AutoML frameworks ignore user preferences for defining or controlling the search space, which can consequently impact the performance of the produced models and their acceptance by end users. Research on the transparency and controllability of AutoML has attracted much interest lately, both in academia and in industry. However, existing tools are usually restricted to supervised learning tasks such as classification and regression, while unsupervised learning, particularly clustering, remains largely unexplored. Motivated by these shortcomings, we design and implement cSmartML-GlassBox, an interactive visualization tool that enables users to refine the AutoML search space and analyze the results. cSmartML-GlassBox is equipped with a recommendation engine that suggests a time budget likely to be adequate for obtaining a well-performing pipeline on a new dataset. In addition, the tool supports multi-granularity visualization that enables machine learning practitioners to monitor the AutoML process, analyze the explored configurations, and refine or control the search space. Furthermore, cSmartML-GlassBox is equipped with a logging mechanism so that repeated runs on the same dataset become more efficient by avoiding re-evaluation of previously considered configurations.
We demonstrate the effectiveness and usability of cSmartML-GlassBox through a user evaluation study with 23 participants and an expert usability study with four experts. We find that the proposed tool increases users' understanding of, and trust in, AutoML frameworks.
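The logging mechanism described in the abstract (skipping configurations that were already evaluated in earlier runs on the same dataset) can be illustrated with a minimal sketch. Note this is a hypothetical implementation for illustration only, not the paper's actual code: the class name `ConfigLog`, the JSON file format, and the `score_fn` callback are all assumptions.

```python
import hashlib
import json
import os


class ConfigLog:
    """Illustrative sketch of a configuration-logging cache: repeated runs
    on the same dataset reuse previously logged scores instead of
    re-evaluating the same pipeline configurations."""

    def __init__(self, path="config_log.json"):
        self.path = path
        self.cache = {}
        # Load scores logged by earlier runs, if any.
        if os.path.exists(path):
            with open(path) as f:
                self.cache = json.load(f)

    def _key(self, config):
        # Stable key for a configuration dict (order-independent).
        blob = json.dumps(config, sort_keys=True)
        return hashlib.sha256(blob.encode()).hexdigest()

    def evaluate(self, config, score_fn):
        key = self._key(config)
        if key in self.cache:
            # Previously considered configuration: reuse the logged score.
            return self.cache[key]
        # New configuration: evaluate, log, and persist.
        score = score_fn(config)
        self.cache[key] = score
        with open(self.path, "w") as f:
            json.dump(self.cache, f)
        return score
```

On a second run over the same dataset, `evaluate` returns the logged score for any configuration the search has already visited, so the (often expensive) clustering evaluation runs only for genuinely new configurations.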