Design Smells in Deep Learning Programs: An Empirical Study

2021 IEEE International Conference on Software Maintenance and Evolution (ICSME) · Pub Date: 2021-07-05 · DOI: 10.26226/morressier.613b5418842293c031b5b61d

Amin Nikanjam, Foutse Khomh


Citations: 6

Abstract

Deep Learning (DL) based software systems are being adopted in a growing number of industries. Designing a DL program requires constructing a deep neural network (DNN) and then training it on a dataset. This process requires developers to make multiple architectural choices (e.g., type, size, number, and order of layers) and configuration choices (e.g., optimizer, regularization methods, and activation functions) that affect the quality of the DL models and, consequently, software quality. An under-specified or poorly designed DL model may train successfully but is likely to perform poorly when deployed in production. Design smells in DL programs are poor design and/or configuration decisions, taken during the development of DL components, that are likely to have a negative impact on the performance (i.e., prediction accuracy) and, in turn, the quality of DL-based software systems. In this paper, we present a catalogue of 8 design smells for a popular DL architecture, namely deep Feedforward Neural Networks, which are widely employed in industrial applications. The design smells were identified through a review of the existing literature on DL design and a manual inspection of 659 DL programs with performance issues and design inefficiencies. The smells are specified by describing their context, consequences, and recommended refactorings. To provide empirical evidence on the relevance and perceived impact of the proposed design smells, we conducted a survey with 81 DL developers. In general, the developers perceived the proposed design smells as reflective of design or implementation problems, with agreement levels varying between 47% and 68%.

