Analyzing Customer Satisfaction using Support Vector Machine and Naive Bayes Utilizing Filipino Text

Q3 Social Sciences

WSEAS Transactions on Environment and Development | Pub Date: 2023-06-06 | DOI: 10.37394/232015.2023.19.50

Joseph B. Campit


Citations: 0


Analyzing Customer Satisfaction using Support Vector Machine and Naive Bayes Utilizing Filipino Text

The study aimed to compare the classification performance of Support Vector Machine (SVM) and Naive Bayes (NB) machine learning models for estimating customer satisfaction utilizing Filipino text. Specifically, it analyzed the characteristics of the customer satisfaction data. It also examined the impact of different model configurations, including n-gram, stop words, and stemming, on the classification performance of the two models. The research employed qualitative and quantitative methods, utilizing text analytics and sentiment analysis to extract and analyze valuable information from unstructured responses from a satisfaction survey of the University President’s leadership performance conducted among PSU personnel and students. The dataset comprised 56,000 Filipino and English-word responses, manually annotated and randomly split into training and testing datasets. The study followed a general framework encompassing data pre-processing, modeling, and model comparison. To validate the classifiers’ classification performance, a 10-fold cross-validation approach was employed. The findings revealed that most personnel and students expressed positive sentiment toward the University President’s leadership performance. SVM outperformed the NB model across all different model configurations. With both stop word removal and stemming, the SVM trigram model achieved the highest classification performance for estimating customer satisfaction, using 75% of the data for training and 25% for testing. The proposed model holds the potential for estimating customer satisfaction using other unstructured customer satisfaction data utilizing Filipino text.
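The pre-processing and classification steps named in the abstract (stop-word removal, n-gram features, a Naive Bayes classifier, and a 75%/25% train/test split) can be illustrated with a minimal standard-library sketch. Everything specific here is an assumption for illustration and not the author's implementation: the stop-word list, the toy responses and labels, the whitespace tokenizer, and the choice to treat an "n-gram model" as all 1- to n-grams are hypothetical. Stemming, the SVM classifier, and 10-fold cross-validation are omitted for brevity.

```python
import math
import random
from collections import Counter, defaultdict

# Hypothetical stop-word list; the study's actual Filipino/English list is not given.
STOP_WORDS = {"ang", "ng", "sa", "na", "at", "the", "a", "of"}


def ngram_features(text, n=1, remove_stop_words=True):
    """Lowercase, split on whitespace, optionally drop stop words,
    then return every 1-gram up to n-gram as a space-joined string."""
    tokens = text.lower().split()
    if remove_stop_words:
        tokens = [t for t in tokens if t not in STOP_WORDS]
    feats = []
    for size in range(1, n + 1):
        for i in range(len(tokens) - size + 1):
            feats.append(" ".join(tokens[i:i + size]))
    return feats


class MultinomialNB:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, docs, labels):
        self.class_counts = Counter(labels)
        self.feature_counts = defaultdict(Counter)
        self.vocab = set()
        for feats, label in zip(docs, labels):
            self.feature_counts[label].update(feats)
            self.vocab.update(feats)
        return self

    def predict(self, feats):
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, count in self.class_counts.items():
            # Log prior plus smoothed log likelihood of each known feature.
            score = math.log(count / total_docs)
            denom = sum(self.feature_counts[label].values()) + len(self.vocab)
            for f in feats:
                if f in self.vocab:  # features unseen in training are skipped
                    score += math.log((self.feature_counts[label][f] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


def train_test_split(items, train_fraction=0.75, seed=0):
    """Shuffle a copy of items and split it, e.g. 75% train / 25% test."""
    shuffled = items[:]
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


# Toy, made-up responses (not from the study's dataset).
data = [
    ("magaling ang pamumuno", "positive"),
    ("mahusay na lider", "positive"),
    ("magaling at mahusay", "positive"),
    ("mabagal ang serbisyo", "negative"),
    ("hindi maayos ang serbisyo", "negative"),
    ("mabagal at hindi maayos", "negative"),
]
model = MultinomialNB().fit(
    [ngram_features(text, n=3) for text, _ in data],
    [label for _, label in data],
)
prediction = model.predict(ngram_features("magaling na lider", n=3))
```

With a real dataset, `train_test_split` would be applied to the annotated responses before fitting, and accuracy would be measured on the held-out 25%. A library implementation (for example, a linear SVM and a count vectorizer with a configurable n-gram range) would be the more practical choice for the SVM side of the comparison.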


Source journal

WSEAS Transactions on Environment and Development Social Sciences-Development

CiteScore

1.90

Self-citation rate

0.00%

Articles published

118

About the journal: WSEAS Transactions on Environment and Development publishes original research papers relating to the study of environmental sciences. We aim to bring important work to a wide international audience and therefore only publish papers of exceptional scientific value that advance our understanding of these particular areas. The research presented must transcend the limits of case studies, while both experimental and theoretical studies are accepted. It is a multi-disciplinary journal and therefore its content mirrors the diverse interests and approaches of scholars involved with sustainable development, climate change, natural hazards, renewable energy systems and related areas. We also welcome scholarly contributions from officials with government agencies, international agencies, and non-governmental organizations.