cSmartML-Glassbox: Increasing Transparency and Controllability in Automated Clustering

Radwa El Shawi, S. Sakr
Published in: 2022 IEEE International Conference on Data Mining Workshops (ICDMW)
Publication date: 2022-11-01
DOI: 10.1109/ICDMW58026.2022.00015 (https://doi.org/10.1109/ICDMW58026.2022.00015)
Citations: 0

Abstract

Machine learning algorithms are widely employed across many applications and fields. Recent advances in automated machine learning (AutoML) reduce the complexity of algorithm selection and hyperparameter optimization. AutoML frameworks have achieved notable success in hyperparameter tuning and have surpassed the performance of human experts. However, using such frameworks as black boxes can leave machine learning practitioners without insight into the inner workings of the AutoML process, which in turn affects their trust in the models produced. In addition, excluding humans from the loop creates several limitations. For example, most current AutoML frameworks ignore user preferences for defining or controlling the search space, which can affect both the performance of the resulting models and their acceptance by end users. Research on the transparency and controllability of AutoML has recently attracted much interest in both academia and industry. However, existing tools are usually restricted to supervised learning tasks such as classification and regression, while unsupervised learning, particularly clustering, remains largely unexplored. Motivated by these shortcomings, we design and implement cSmartML-GlassBox, an interactive visualization tool that enables users to refine the AutoML search space and analyze the results. cSmartML-GlassBox is equipped with a recommendation engine that suggests a time budget likely to be adequate for obtaining a well-performing pipeline on a new dataset. In addition, the tool supports multi-granularity visualization, enabling machine learning practitioners to monitor the AutoML process, analyze the explored configurations, and refine or control the search space. Furthermore, cSmartML-GlassBox is equipped with a logging mechanism that makes repeated runs on the same dataset more effective by avoiding re-evaluation of previously considered configurations. We demonstrate the effectiveness and usability of cSmartML-GlassBox through a user evaluation study with 23 participants and an expert-based usability study with four experts. We find that the proposed tool increases users' understanding of and trust in AutoML frameworks.
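
The abstract does not describe how the logging mechanism is implemented. As a rough illustration of the general idea only (skipping configurations that were already evaluated on the same dataset), the following minimal Python sketch caches scores under a stable key. The names `EvaluationLog`, `config_key`, and `score_fn` are hypothetical and are not taken from cSmartML-GlassBox.

```python
import hashlib
import json
from typing import Any, Callable, Dict


def config_key(dataset_id: str, config: Dict[str, Any]) -> str:
    """Return a stable key for a (dataset, configuration) pair.

    Keys are sorted before hashing, so two dicts with the same content
    but a different insertion order map to the same key.
    """
    payload = json.dumps({"dataset": dataset_id, "config": config}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


class EvaluationLog:
    """Hypothetical in-memory log of configurations already scored."""

    def __init__(self) -> None:
        self._scores: Dict[str, float] = {}
        self.evaluations = 0  # number of real (non-cached) evaluations

    def evaluate(
        self,
        dataset_id: str,
        config: Dict[str, Any],
        score_fn: Callable[[Dict[str, Any]], float],
    ) -> float:
        """Score a configuration, reusing the logged score if one exists."""
        key = config_key(dataset_id, config)
        if key not in self._scores:
            # Only configurations not seen before on this dataset are evaluated.
            self._scores[key] = score_fn(config)
            self.evaluations += 1
        return self._scores[key]
```

In this sketch, calling `evaluate` twice with the same dataset identifier and an equivalent configuration invokes `score_fn` only once. A tool that must reuse results across separate runs would need to persist the log to disk, which is omitted here.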