Orthogonal least square based feature selection for an automatic hate speech detection and classification
Srinivasulu Kothuru, A. Santhanavijayan
Computers & Electrical Engineering, Volume 123, Article 110131, published 2025-02-01
DOI: 10.1016/j.compeleceng.2025.110131
https://www.sciencedirect.com/science/article/pii/S0045790625000746
Impact factor: 4.0 (JCR Q1, Computer Science, Hardware & Architecture)
Citations: 0
Abstract
Hate speech on social media is a growing problem that harms both individuals and society. Detecting it is challenging because of the vast volume of user data generated daily, which makes it impractical to review every user comment. The main objective of this research is to detect and classify hate speech effectively using Orthogonal Least Squares (OLS)-based feature selection. Several feature extraction approaches, namely Term Frequency-Inverse Document Frequency (TF-IDF), count vectorizer, the Global Vectors (GloVe) model, and aspect features, are used to capture the frequent occurrences of keywords, the coverage of keywords, and the contextual meaning of words. The features chosen by OLS are classified using a Stacked Bidirectional Long Short-Term Memory with Multiple Attention Mechanism (SBiLSTM-MAM), which computes attention among the hidden states and utterance embeddings. OLS discards unwanted features and concentrates on the most significant ones, which improves classification with SBiLSTM-MAM. Specifically, OLS ranks the features according to their orthogonal significance, ensuring that the chosen features are less collinear and thus minimizing noise. The results show that the proposed OLS-SBiLSTM-MAM achieves precisions of 80.22% and 84.7% on the OLID and SOLID datasets, respectively, which are higher than those of the existing Softplus BiLSTM.
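To make the idea of ranking features by "orthogonal significance" concrete, the following is a minimal sketch of one common form of OLS forward selection: greedy selection by error reduction ratio with Gram-Schmidt orthogonalization. The abstract does not state the exact criterion the authors use, so the scoring rule, the function name `ols_select`, and the `eps` threshold are assumptions made for illustration only.

```python
def dot(a, b):
    """Inner product of two equal-length sequences."""
    return sum(x * y for x, y in zip(a, b))


def ols_select(columns, target, k, eps=1e-12):
    """Greedy forward selection by error reduction ratio (ERR).

    Hypothetical helper, not the paper's implementation.
    columns: list of feature columns (each a list of floats, one per sample)
    target:  list of floats, one per sample
    k:       maximum number of features to keep
    Returns the indices of the selected columns, in selection order.
    """
    yy = dot(target, target)
    if yy <= eps:
        raise ValueError("target has no energy")
    # Working copies of the candidates; they are orthogonalized step by step.
    residual = [list(col) for col in columns]
    selected = []
    for _ in range(min(k, len(columns))):
        best, best_err = None, 0.0
        for j, w in enumerate(residual):
            if j in selected:
                continue
            ww = dot(w, w)
            if ww <= eps:
                # Nothing left in this column that the selected ones do not explain.
                continue
            # Fraction of the target's energy explained by this direction.
            err = dot(w, target) ** 2 / (ww * yy)
            if best is None or err > best_err:
                best, best_err = j, err
        if best is None:
            break
        selected.append(best)
        w_best = residual[best]
        ww_best = dot(w_best, w_best)
        # Gram-Schmidt step: remove the chosen direction from the remaining candidates.
        for j, w in enumerate(residual):
            if j not in selected:
                coef = dot(w, w_best) / ww_best
                residual[j] = [a - coef * b for a, b in zip(w, w_best)]
    return selected


# Toy data: column 1 duplicates column 0, so it is perfectly collinear.
features = [
    [1.0, 0.0, 1.0, 0.0],
    [1.0, 0.0, 1.0, 0.0],
    [0.0, 1.0, 0.0, 1.0],
]
labels = [1.0, 2.0, 1.0, 2.0]
chosen = ols_select(features, labels, k=3)
```

In this toy example the duplicated column is never selected, because after orthogonalization against the column it copies nothing of it remains. This is the sense in which the selected features end up less collinear.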
About the journal:
The impact of computers has nowhere been more revolutionary than in electrical engineering. The design, analysis, and operation of electrical and electronic systems are now dominated by computers, a transformation that has been motivated by the natural ease of interface between computers and electrical systems, and the promise of spectacular improvements in speed and efficiency.
Published since 1973, Computers & Electrical Engineering provides rapid publication of topical research into the integration of computer technology and computational techniques with electrical and electronic systems. The journal publishes papers featuring novel implementations of computers and computational techniques in areas like signal and image processing, high-performance computing, parallel processing, and communications. Special attention will be paid to papers describing innovative architectures, algorithms, and software tools.