Classifying Mobile Applications Using Word Embeddings

Fahime Ebrahimi, Miroslav Tushev, Anas Mahmoud
{"title":"Classifying Mobile Applications Using Word Embeddings","authors":"Fahime Ebrahimi, Miroslav Tushev, Anas Mahmoud","doi":"10.1145/3474827","DOIUrl":null,"url":null,"abstract":"Modern application stores enable developers to classify their apps by choosing from a set of generic categories, or genres, such as health, games, and music. These categories are typically static—new categories do not necessarily emerge over time to reflect innovations in the mobile software landscape. With thousands of apps classified under each category, locating apps that match a specific consumer interest can be a challenging task. To overcome this challenge, in this article, we propose an automated approach for classifying mobile apps into more focused categories of functionally related application domains. Our aim is to enhance apps visibility and discoverability. Specifically, we employ word embeddings to generate numeric semantic representations of app descriptions. These representations are then classified to generate more cohesive categories of apps. Our empirical investigation is conducted using a dataset of 600 apps, sampled from the Education, Health&Fitness, and Medical categories of the Apple App Store. The results show that our classification algorithms achieve their best performance when app descriptions are vectorized using GloVe, a count-based model of word embeddings. Our findings are further validated using a dataset of Sharing Economy apps and the results are evaluated by 12 human subjects. The results show that GloVe combined with Support Vector Machines can produce app classifications that are aligned to a large extent with human-generated classifications.","PeriodicalId":7398,"journal":{"name":"ACM Transactions on Software Engineering and Methodology (TOSEM)","volume":"176 1","pages":"1 - 30"},"PeriodicalIF":0.0000,"publicationDate":"2021-11-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"5","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"ACM Transactions on Software Engineering and Methodology (TOSEM)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3474827","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 5

Abstract

Modern application stores enable developers to classify their apps by choosing from a set of generic categories, or genres, such as health, games, and music. These categories are typically static—new categories do not necessarily emerge over time to reflect innovations in the mobile software landscape. With thousands of apps classified under each category, locating apps that match a specific consumer interest can be a challenging task. To overcome this challenge, in this article, we propose an automated approach for classifying mobile apps into more focused categories of functionally related application domains. Our aim is to enhance apps visibility and discoverability. Specifically, we employ word embeddings to generate numeric semantic representations of app descriptions. These representations are then classified to generate more cohesive categories of apps. Our empirical investigation is conducted using a dataset of 600 apps, sampled from the Education, Health&Fitness, and Medical categories of the Apple App Store. The results show that our classification algorithms achieve their best performance when app descriptions are vectorized using GloVe, a count-based model of word embeddings. Our findings are further validated using a dataset of Sharing Economy apps and the results are evaluated by 12 human subjects. The results show that GloVe combined with Support Vector Machines can produce app classifications that are aligned to a large extent with human-generated classifications.
使用词嵌入对移动应用进行分类
现代应用商店允许开发者从一组通用类别或类型(如健康、游戏和音乐)中选择应用进行分类。这些类别通常是静态的——新的类别不一定会随着时间的推移而出现,以反映移动软件领域的创新。由于每个类别下都有成千上万的应用程序,定位符合特定消费者兴趣的应用程序可能是一项具有挑战性的任务。为了克服这一挑战,在本文中,我们提出了一种自动方法,将移动应用程序分类为功能相关应用程序领域的更集中类别。我们的目标是提高应用的可见性和可发现性。具体来说,我们使用词嵌入来生成应用程序描述的数字语义表示。然后对这些表示进行分类,以生成更具凝聚力的应用程序类别。我们的实证调查使用了600个应用程序的数据集,从苹果应用程序商店的教育、健康与健身和医疗类别中抽取样本。结果表明,当使用GloVe(一种基于计数的词嵌入模型)对应用程序描述进行矢量化时,我们的分类算法达到了最佳性能。使用共享经济应用程序的数据集进一步验证了我们的发现,并由12名人类受试者对结果进行了评估。结果表明,GloVe与支持向量机相结合可以产生与人类生成的分类在很大程度上一致的应用程序分类。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 求助全文
来源期刊
自引率
0.00%
发文量
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信