Using Machine Learning to Predict Ranking of Webpages in the Gift Industry: Factors for Search-Engine Optimization

Joni O. Salminen, Juan Corporan, Roope Marttila, Tommi Salenius, B. Jansen
Proceedings of the 9th International Conference on Information Systems and Technologies. Published 2019-03-24. DOI: 10.1145/3361570.3361578 (https://doi.org/10.1145/3361570.3361578)
Citations: 8

Abstract

We use machine learning to predict the search engine rank of webpages. We use a list of keywords for 30 content blogs of an e-commerce company in the gift industry to retrieve 733 content pages occupying the first-page Google rankings and predict their rank using 30 ranking factors. We test two models, Light Gradient Boosting Machine (LightGBM) and Extreme Gradient Boosted Decision Trees (XGBoost), finding that XGBoost performs better for predicting actual search rankings, with an average accuracy of 0.86. The feature analysis shows the most impactful features are (a) internal and external links, (b) security of the web domain, and (c) length of H3 headings, and the least impactful features are (a) keyword mentioned in domain address, (b) keyword mentioned in the H1 headings, and (c) overall number of keyword mentions in the text. The results highlight the persistent importance of links in search-engine optimization. We provide actionable insights for online marketers and content creators.
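The abstract names several on-page ranking factors but does not say how they were measured. The following is a minimal sketch, using only Python's standard library, of how a handful of them might be computed from a page's URL and raw HTML. The names `PageFeatureParser` and `extract_features`, the feature keys, and the exact definitions (for example, H3 length as mean character count, and an internal link as one that is relative or points to the same host) are illustrative assumptions, not the authors' pipeline. Text inside `<script>` or `<style>` tags is not filtered out here, which a real extractor would likely want to do.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


class PageFeatureParser(HTMLParser):
    """Collects link targets, H1/H3 heading text, and text nodes from one page."""

    def __init__(self):
        super().__init__()
        self.links = []                        # href values of <a> tags
        self.headings = {"h1": [], "h3": []}   # heading text per level
        self.text_parts = []                   # every text node on the page
        self._open_heading = None              # heading tag currently open

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)
        elif tag in self.headings:
            self._open_heading = tag
            self.headings[tag].append("")

    def handle_endtag(self, tag):
        if tag == self._open_heading:
            self._open_heading = None

    def handle_data(self, data):
        self.text_parts.append(data)
        if self._open_heading:
            self.headings[self._open_heading][-1] += data


def extract_features(url, html, keyword):
    """Return a dict of hypothetical ranking-factor features for one page."""
    parser = PageFeatureParser()
    parser.feed(html)
    parser.close()

    page = urlparse(url)
    kw = keyword.lower()

    internal = external = 0
    for href in parser.links:
        target = urlparse(href)
        if target.scheme not in ("", "http", "https"):
            continue  # skip mailto:, tel:, javascript: and similar
        if target.netloc in ("", page.netloc):
            internal += 1  # relative link or same host
        else:
            external += 1

    text = " ".join(parser.text_parts).lower()
    h3 = [h.strip() for h in parser.headings["h3"]]
    return {
        "internal_links": internal,
        "external_links": external,
        "is_https": int(page.scheme == "https"),
        "mean_h3_length": sum(len(h) for h in h3) / len(h3) if h3 else 0.0,
        "keyword_in_domain": int(kw.replace(" ", "") in page.netloc.lower()),
        "keyword_in_h1": int(any(kw in h.lower() for h in parser.headings["h1"])),
        "keyword_mentions": text.count(kw),
    }


# Made-up page, for illustration only.
sample_html = """
<html><body>
<h1>Best Gift Ideas for Mom</h1>
<p>Our gift ideas cover every budget.</p>
<h3>Under $20</h3>
<h3>Handmade</h3>
<a href="/blog/wrapping">Wrapping tips</a>
<a href="https://example.org/study">A study</a>
</body></html>
"""
features = extract_features(
    "https://giftideas.example.com/mom", sample_html, "gift ideas"
)
print(features)
```

Each returned dict could serve as one row of a feature table, with the page's observed search position as the target for a gradient-boosted model such as those the abstract compares.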