{"title":"FEATURE ENGINEERING WITH SENTENCE SIMILARITY USING THE LONGEST COMMON SUBSEQUENCE FOR EMAIL CLASSIFICATION","authors":"Aruna Kumara B, M. Kodabagi","doi":"10.22452/mjcs.sp2022no2.6","DOIUrl":"https://doi.org/10.22452/mjcs.sp2022no2.6","url":null,"abstract":"Feature selection plays a prominent role in email classification since selecting the most relevant features enhances the accuracy and performance of the learning classifier. Due to the exponential increase rate in the usage of emails, the classification of such emails posed a fitting problem. Therefore, there is a requirement for a proper classification system. Such an email classification system requires an efficient feature selection method for the accurate classification of the most relevant features. This paper proposes a novel feature selection method for sentence similarity using the longest common subsequence for email classification. The proposed feature selection method works in two main phases: First, it builds the longest common subsequence vector of features by comparing each email with all other emails in the dataset. Later, a template is constructed for each class using the closest features of emails of a particular class. Further, email classification is tested for unseen emails using these templates. The performance of the proposed method is compared with traditional feature selection methods such as TF-IDF, Information Gain, Chi-square, and semantic approach. 
The experimental results showed that the proposed method performed well with 96.61% accuracy.","PeriodicalId":49894,"journal":{"name":"Malaysian Journal of Computer Science","volume":" ","pages":""},"PeriodicalIF":0.6,"publicationDate":"2022-12-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"47225062","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"PRIOR DETECTION OF ALZHEIMER’S DISEASE WITH THE AID OF MRI IMAGES AND DEEP NEURAL NETWORKS","authors":"K. S. A., Priya Nandihal, Seemanthini K, Manjunath D R, L. Liyakathunisa","doi":"10.22452/mjcs.sp2022no2.2","DOIUrl":"https://doi.org/10.22452/mjcs.sp2022no2.2","url":null,"abstract":"Alzheimer's disease is a degenerative disease in which brain cells die and deteriorate. It is the most prevalent reason for dementia, which is defined as a progressive decrease in thinking, conduct, and social skills that impairs a person's capacity to operate independently. Although it is fatal the early diagnosis of Alzheimer's can be extremely helpful. Our main aim is to help with the diagnosis of this disease in its early stages using the VGG16 classifier which is a convolutional neural network (CNN) that is 16 layers deep. The dataset consists of MRI images of the brain. Data augmentation is done to significantly increase the diversity of data available and Data pre-processing helps to enhance the overall truthfulness of the proposed approach.","PeriodicalId":49894,"journal":{"name":"Malaysian Journal of Computer Science","volume":" ","pages":""},"PeriodicalIF":0.6,"publicationDate":"2022-12-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"47406381","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A NOVEL COMPARATIVE STUDY FOR AUTOMATIC THREE-CLASS AND FOUR-CLASS COVID-19 CLASSIFICATION ON X-RAY IMAGES USING DEEP LEARNING","authors":"H. Yaşar, M. Ceylan","doi":"10.22452/mjcs.vol35no4.5","DOIUrl":"https://doi.org/10.22452/mjcs.vol35no4.5","url":null,"abstract":"The contagiousness rate of the COVID-19 virus, which was evaluated to have been transmitted from an animal to a human during the last months of 2019, is higher than that of the MERS-Cov and SARS-Cov viruses originating from the same family. The high rate of contagion has caused the COVID-19 virus to spread rapidly to all countries of the world. It is of great importance to be able to detect cases quickly in order to control the spread of the COVID-19 virus. Therefore, the development of systems that make automatic COVID-19 diagnoses using artificial intelligence approaches based on X-ray, CT scans, and ultrasound images is an urgent and indispensable requirement. In order to increase the number of X-ray images used within the study, a mixed data set was created by combining eight different data sets, thus maximizing the scope of the study. In the study, a total of 9,667 X-ray images were used, including 3,405 of COVID-19 samples, 2,780 of bacterial pneumonia samples, 1,493 of viral pneumonia samples and 1,989 of healthy samples. In this study, which aims to diagnose COVID-19 disease using X-ray images, automatic classification has been performed using two different classification structures: COVID-19 Pneumonia/Other Pneumonia/Healthy and COVID-19 Pneumonia/Bacterial Pneumonia/Viral Pneumonia/Healthy. Convolutional Neural Networks (CNNs), a successful deep learning method, were used as a classifier within the study. A total of seven CNN architectures were used: Mobilenetv2, Resnet101, Googlenet, Xception, Densenet201, Efficientnetb0, and Inceptionv3 architectures. 
Classification results were obtained from the original X-ray images and from the images obtained by applying Local Binary Pattern and Local Entropy. Then, new classification results were calculated from the obtained results using a pipeline algorithm. Detailed results were obtained to meet the scope of the study. According to the results of the experiments carried out, the three most successful CNN architectures for both three-class and four-class automatic classification were Densenet201, Xception, and Inceptionv3, respectively. In addition, it is understood that the pipeline algorithm used in the study is very useful for improving the results. The study results show that an improvement of up to 1.57% was achieved in some comparison parameters.","PeriodicalId":49894,"journal":{"name":"Malaysian Journal of Computer Science","volume":" ","pages":""},"PeriodicalIF":0.6,"publicationDate":"2022-10-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"46550013","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"EXPLAINING PHYSIOLOGICAL AFFECT RECOGNITION WITH OPTIMIZED ENSEMBLES OF CLUSTERED EXPLAINABLE MODELS","authors":"W. S. Liew, C. Loo","doi":"10.22452/mjcs.vol35no4.4","DOIUrl":"https://doi.org/10.22452/mjcs.vol35no4.4","url":null,"abstract":"Affect recognition tasks involving physiological signals are difficult to generalize across a large population due to low signal-to-noise ratio and limited data availability. In addition, the use of deep learning models makes it difficult to determine the cause-and-effect between physiological affect and labeled affect. This work addresses the following issues: uneven distribution and noisy data were addressed using K-Means-SMOTE and Fuzzy ART (FA). The clustered hyper-rectangles were extracted from the FA topology and fitted to an Explainable Boosting Machines ensemble using the Easy Ensemble strategy. The hyper parameters of the overall methodology were tuned using genetic algorithms for improved generalization. The proposed method was tested using three publicly available affect recognition datasets: DEAP, DREAMER, and AMIGOS. Step-by-step benchmarks showed that combining techniques achieved good generalization and generated explainable information correlating physiological features to affective labels.","PeriodicalId":49894,"journal":{"name":"Malaysian Journal of Computer Science","volume":" ","pages":""},"PeriodicalIF":0.6,"publicationDate":"2022-10-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"46158866","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"REAL-TIME EYE TRACKING USING HEAT MAPS","authors":"Chetana Krishnan, V. Jeyakumar, Alex Noel Joseph Raj","doi":"10.22452/mjcs.vol35no4.3","DOIUrl":"https://doi.org/10.22452/mjcs.vol35no4.3","url":null,"abstract":"Communication in modern days has developed a lot, including wireless networks, Artificial Intelligence (AI) interaction, and human-computer interfaces. People with paralysis and immobile disorders face daily difficulties communicating with others and gadgets. Eye tracking has proven to promote accessible and accurate interaction compared to other complex automatic interactions. The project aims to develop an electronic eye blinker that integrates with the experimental setup to determine clinical pupil redundancy. The proposed solution comes up with an eye-tracking tool within an inbuilt laptop webcam that tracks the eye’s pupil in the given screen dimensions and generates heat maps on the tracked locations. These heat maps can denote a letter (in case of eye writing), an indication to click on that location (in case of gadget communication), or for blinking analysis. The proposed method achieves a perfect F-measure score of 0.998 to 1.000, which is comparatively more accurate and efficient than the existing technologies. The solution also provides an effective method to determine the eye's refractive error, which can replace the complex refractometers. Further, the spatially tracked coordinates obtained during the experiment can be used to analyze the patient’s blinking pattern, which, in turn, can detect retinal disorders and their progress during medication. 
One of the applications of the project is to integrate the derived model with a Brain-computer interface system to allow fast communications for the disabled.","PeriodicalId":49894,"journal":{"name":"Malaysian Journal of Computer Science","volume":" ","pages":""},"PeriodicalIF":0.6,"publicationDate":"2022-10-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"43749414","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"EXPLORING MANET SECURITY ASPECTS: ANALYSIS OF ATTACKS AND NODE MISBEHAVIOUR ISSUES","authors":"B. Khan, F. Anwar, Farah Diyana Bt. Abdul Rahman, R. F. Olanrewaju, M. L. Mat Kiah, M. A. Rahman, Z. Janin","doi":"10.22452/mjcs.vol35no4.2","DOIUrl":"https://doi.org/10.22452/mjcs.vol35no4.2","url":null,"abstract":"Mobile ad hoc networks are susceptible to various security threats due to their open media nature and mobility, making them a top priority for security measures. This paper provides an in-depth examination of MANET security issues. Some of the most critical aspects of mobile ad hoc networks, including their applications, have been discussed. This is followed by a discussion of MANETs' design vulnerability to external and internal security threats caused by inherent network characteristics such as limited battery power, mobility, dynamic topology, open media, and so on. Numerous MANET-related attacks have been classified based on their sources, behaviour, participating nodes, processing capability, and layering. The many different types of misbehaviour a node can exhibit and the various ways a node can behave were investigated. Two major types of MANETs misbehaviour have been evaluated, classified and analysed. Notably, mitigating node misbehaviour in MANET is a critical issue that must be addressed to ensure network node functionality and availability. Strategies for detecting network nodes that misroute packets are also examined. 
Finally, the paper emphasises the need for effective solutions to secure MANETs.","PeriodicalId":49894,"journal":{"name":"Malaysian Journal of Computer Science","volume":" ","pages":""},"PeriodicalIF":0.6,"publicationDate":"2022-10-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"44053606","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A NOVEL SCHEDULING APPROACH FOR PILGRIM FLIGHTS OPTIMIZATION PROBLEM","authors":"M. Y. Shambour, Esam A. Khan","doi":"10.22452/mjcs.vol35no4.1","DOIUrl":"https://doi.org/10.22452/mjcs.vol35no4.1","url":null,"abstract":"The main goal of airport administrations around the world is to facilitate the conduct of passenger services and reduce waiting time as much as possible. This can be achieved by regulating the flow of passengers at the various stages of the airport, including arrival and departure halls, passport checkpoints, luggage handling, and customs. This study focuses on improving the flow of passengers in the Hajj terminal at King Abdulaziz International Airport (KAIA) in the Kingdom of Saudi Arabia, as it is one of the most welcoming stations for travelers during the Hajj season and is the fourth largest passenger terminal in the world. Three different optimization algorithms are applied to improve the scheduling process of assigning the arrival flights to available airport gates, as well as the stages inside the various airport lounges and areas. These algorithms are genetic algorithm (GA), harmony search algorithm (HSA), and differential evolution algorithm (DEA). The results give a prior knowledge of how the whole passengers’ arrival process and show the stages that are prone to congestion and cause process delay. Experimental performance results in terms of fitness value and convergence rate show that GA outperforms HSA and DEA when the population size is equal to 5, whereas DEA provides better performance compared to other algorithms when the population size is equal to 20 and 50. 
Moreover, the results show that the largest waiting time for passengers was in the arrival gate lounges due to the lack of allocated spaces in the passport areas, followed by the luggage area, then the passport control and customs areas, respectively.","PeriodicalId":49894,"journal":{"name":"Malaysian Journal of Computer Science","volume":" ","pages":""},"PeriodicalIF":0.6,"publicationDate":"2022-10-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"49615587","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"GRANULAR NETWORK TRAFFIC CLASSIFICATION FOR STREAMING TRAFFIC USING INCREMENTAL LEARNING AND CLASSIFIER CHAIN","authors":"Faiz Zaki, Firdaus Afifi, A. Gani, N. B. Anuar","doi":"10.22452/mjcs.vol35no3.5","DOIUrl":"https://doi.org/10.22452/mjcs.vol35no3.5","url":null,"abstract":"In modern networks, network visibility is of utmost importance to network operators. Accordingly, granular network traffic classification quickly rises as an essential technology due to its ability to provide high network visibility. Granular network traffic classification categorizes traffic into detailed classes like application names and services. Application names represent parent applications, such as Facebook, while application services are the individual actions within the parent application, such as Facebook-comment. Most studies on granular classification focus on classification at the application name level. Besides that, evaluations in existing studies are also limited and utilize only static and immutable datasets, which are insufficient to reflect the continuous and evolving nature of real-world traffic. Therefore, this paper aims to introduce a granular classification technique, which is evaluated on streaming traffic. The proposed technique implements two Adaptive Random Forest classifiers linked together using a classifier chain to simultaneously produce classification at two granularity levels. Performance evaluation on a streaming testbed setup using Apache Kafka showed that the proposed technique achieved an average F1 score of 99% at the application name level and 88% at the application service level. Additionally, the performance benchmark on ISCX VPN non-VPN public dataset also maintained comparable results, besides recording classification time as low as 2.6 ms per packet. 
The results demonstrate the advantage and feasibility of the proposed technique for granular classification of streaming traffic.","PeriodicalId":49894,"journal":{"name":"Malaysian Journal of Computer Science","volume":" ","pages":""},"PeriodicalIF":0.6,"publicationDate":"2022-07-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"47834248","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"SELF-ORGANIZING RESERVOIR NETWORK FOR ACTION RECOGNITION","authors":"G. Lee, C. Loo, W. S. Liew","doi":"10.22452/mjcs.vol35no3.4","DOIUrl":"https://doi.org/10.22452/mjcs.vol35no3.4","url":null,"abstract":"Current research in human action recognition (HAR) focuses on efficient and effective modelling of the temporal features of human actions in 3-dimensional space. Echo State Networks (ESNs) are one suitable method for encoding the temporal context due to its short-term memory property. However, the random initialization of the ESN's input and reservoir weights may increase instability and variance in generalization. Inspired by the notion that input-dependent self-organization is decisive for the cortex to adjust the neurons according to the distribution of the inputs, a Self-Organizing Reservoir Network (SORN) is developed based on Adaptive Resonance Theory (ART) and Instantaneous Topological Mapping (ITM) as the clustering process to cater deterministic initialization of the ESN reservoirs in a Convolutional Echo State Network (ConvESN) and yield a Self-Organizing Convolutional Echo State Network (SO-ConvESN). SORN ensures that the activation of ESN’s internal echo state representations reflects similar topological qualities of the input signal which should yield a self-organizing reservoir. In the context of HAR task, human actions encoded as a multivariate time series signals are clustered into clustered node centroids and interconnectivity matrices by SORN for initializing the SO-ConvESN reservoirs. 
By using several publicly available 3D-skeleton-based action recognition datasets, the impact of vigilance threshold and reservoir perturbation of SORN in performing clustering, the SORN reservoir dynamics and the capability of SO-ConvESN on HAR task have been empirically evaluated and analyzed to produce competitive experimental results.","PeriodicalId":49894,"journal":{"name":"Malaysian Journal of Computer Science","volume":"1 1","pages":""},"PeriodicalIF":0.6,"publicationDate":"2022-07-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"41372716","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"WDSAE-DNDT BASED SPEECH FLUENCY DISORDER CLASSIFICATION","authors":"S. Pravin, M. Palanivelan","doi":"10.22452/mjcs.vol35no3.3","DOIUrl":"https://doi.org/10.22452/mjcs.vol35no3.3","url":null,"abstract":"In this paper, Weight Decorrelated Stacked Autoencoder-Deep Neural Decision Trees (WDSAE-DNDT), a novel hybrid model is proposed for automating the assessment of children’s speech fluency disorders by discerning their disfluencies. In fluency disorder classification, it is imperative to know how each feature contributes to the disorder classification rather than the diagnosis itself and so the depth modified DNDT acts as the best discriminator since it is interpretable by its very nature. The WDSAE presents DNDT with a high-level latent representation of the disfluent speech. A fusion feature vector was built by combining the prosodic cues from disfluent speech segments combined with the WDSAE-based Bottleneck features. The proposed hybrid model was compared with the performance of the experimented baseline models. Further analysis was carried out to check the impact of tree cut points for each feature and epochs on the accuracy of prediction of the hybrid model. The proposed hybrid model when trained on the fusion feature set has shown appreciable improvement in the area under the Receiver Operating Characteristics (ROC) curve, classification accuracy, Kappa statistical value, and Jaccard similarity index. 
The WDSAE-DNDT demonstrates higher precision than the baseline models in setting a clinical benchmark to distinguish subjects with dysphemia from those with Specific Language Impairment.","PeriodicalId":49894,"journal":{"name":"Malaysian Journal of Computer Science","volume":" ","pages":""},"PeriodicalIF":0.6,"publicationDate":"2022-07-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"47993517","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}