INTERFACES SIMILARITY ANALYSIS FOR PROGRESSIVE WEB APPS AND WEB-APPLICATIONS BASED ON DISTILBERT TRANSFORMER

H. Yehoshyna, S. M. Voronoy, O. Polikarovskykh, R. Gokhman
Scientific papers of Donetsk National Technical University. Series: Informatics, Cybernetics and Computer Science
DOI: https://doi.org/10.31474/1996-1588-2023-1-36-51-60

Abstract

An approach to automated testing of Progressive Web Application (PWA) interface components is proposed, based on determining how closely they correspond to elements of the web version of the same application. Modern trends and existing categories in the field of Web Mining are analyzed. It is shown that the predominant trend in analyzing the interface structures of modern web applications is the use of Deep Learning. The features and operation of the recent Transformer neural network architecture are considered, and the choice of a Transformer-type model for determining the correspondence between the site structure and the PWA interface is justified. It is shown that, when fragments of the web service and PWA interfaces are compared, some elements carry more impact (weight) than others; the multidimensional "self-attention" mechanism is proposed to account for this property of the content. It is shown that the analysis of interface correspondence is a binary classification task. The features of Bidirectional Encoder Representations from Transformers (BERT) models are reviewed. A pretrained BERT model can be fine-tuned with just one additional output layer to create powerful models for a wide range of problems. It is proposed to use transfer learning, namely the DistilBERT model, fine-tuned using the DistilBertForSequenceClassification class. For the base DistilBERT architecture (embedding and encoder layers), the weights of the English-language model "distilbert-base-uncased-finetuned-sst-2-english" were used. The model was optimized using a modification of the Adam stochastic gradient descent method. A low learning rate is also suggested to avoid "forgetting" what the pretrained model has already learned. The features of data preprocessing using DistilBertTokenizer are shown. The model architecture was designed and studied on a data set of CSS properties, which define the styling and layout of interface elements.
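
The fine-tuning setup described in the abstract can be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' implementation: the abstract names the checkpoint, the DistilBertForSequenceClassification class and DistilBertTokenizer, but the helper `css_to_text`, the function `fine_tune`, the serialization format, the learning rate of 2e-5, the epoch count, and the use of AdamW as the "modification of Adam" are all hypothetical choices made here. The sketch assumes the `torch` and `transformers` packages are installed.

```python
"""Sketch: fine-tuning DistilBERT as a binary classifier over CSS-property pairs."""

# Checkpoint named in the abstract.
CHECKPOINT = "distilbert-base-uncased-finetuned-sst-2-english"


def css_to_text(props):
    """Serialize a CSS property dict into one string (hypothetical format)."""
    return "; ".join(f"{name}: {value}" for name, value in sorted(props.items()))


def fine_tune(pairs, labels, epochs=3, lr=2e-5):
    """Fine-tune on (web_css, pwa_css) dict pairs; label 1 = match, 0 = mismatch.

    The low default learning rate is an assumed value, chosen to limit
    "forgetting" of the pretrained weights.
    """
    # Heavy imports are local so css_to_text stays usable without them.
    import torch
    from transformers import DistilBertForSequenceClassification, DistilBertTokenizer

    tokenizer = DistilBertTokenizer.from_pretrained(CHECKPOINT)
    model = DistilBertForSequenceClassification.from_pretrained(CHECKPOINT, num_labels=2)

    # Encode each pair as two text segments of a single input sequence.
    web = [css_to_text(w) for w, _ in pairs]
    pwa = [css_to_text(p) for _, p in pairs]
    batch = tokenizer(web, pwa, padding=True, truncation=True, return_tensors="pt")
    targets = torch.tensor(labels)

    # AdamW is assumed here; the abstract only says "a modification of Adam".
    optimizer = torch.optim.AdamW(model.parameters(), lr=lr)
    model.train()
    for _ in range(epochs):
        optimizer.zero_grad()
        loss = model(**batch, labels=targets).loss  # cross-entropy over 2 classes
        loss.backward()
        optimizer.step()
    return tokenizer, model
```

Calling `fine_tune([({"color": "red"}, {"color": "red"})], [1])` would run a few full-batch updates on a single matching pair; a real experiment would use mini-batches and a held-out validation split.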