Malaysian Journal of Computer Science: Latest Articles, Page 6

IMPROVING MULTI-LABEL TEXT CLASSIFICATION USING WEIGHTED INFORMATION GAIN AND CO-TRAINED MULTINOMIAL NAÏVE BAYES CLASSIFIER

IF 0.6, Zone 4 (Computer Science)

Malaysian Journal of Computer Science Pub Date : 2022-01-31 DOI: 10.22452/mjcs.vol35no1.2

W. Kaur, Vimala Balakrishnan, Kok-Seng Wong

Abstract: Over recent years, the emergence of electronic text processing systems has generated a vast amount of structured and unstructured data, leaving users to rummage through irrelevant information. Studies therefore continue to look for ways to improve the classification process so that it produces more accurate results for users. This paper looks into the weighted information gain method, which re-assigns new weights to wrongly classified features to provide better classification. The method focuses on the weights of the frequency bins, assuming that every time a certain word frequency bin is iterated, it provides information on the target word feature. Therefore, the more iterations and weight re-assignments occur within a bin, the more important the bin becomes, eventually providing better classification. The proposed algorithm was trained and tested using a corpus extracted from dedicated Facebook pages related to diabetes. The features selected by the weighted information gain technique are then fed into a co-trained Multinomial Naïve Bayes classification algorithm that captures the labels' dependencies. The algorithm incorporates class value dependencies since the dataset used multi-label data before converting string vectors that allow the sparse distribution between features to be minimised, thus producing more accurate results. The results of this study show an improvement in classification to 61%.

Citations: 3
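The abstract above builds on information gain as a feature selection score. As a rough illustration, the sketch below computes the standard, unweighted information gain of a binary word-presence feature. It does not implement the paper's weighted variant or its frequency bins, and the words, labels, and function names are hypothetical.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def information_gain(has_word, labels):
    """Information gain of a binary word-presence feature about the labels."""
    total = len(labels)
    with_word = [y for x, y in zip(has_word, labels) if x]
    without_word = [y for x, y in zip(has_word, labels) if not x]
    # Entropy remaining after splitting on the feature (empty splits are skipped).
    conditional = sum(
        (len(part) / total) * entropy(part)
        for part in (with_word, without_word)
        if part
    )
    return entropy(labels) - conditional


# Hypothetical toy data: whether each post contains a word, plus a made-up label.
labels = ["treatment", "treatment", "diet", "diet"]
has_insulin = [True, True, False, False]  # separates the labels perfectly
has_the = [True, False, True, False]      # spread evenly over both labels

ig_perfect = information_gain(has_insulin, labels)
ig_useless = information_gain(has_the, labels)
```

A word that separates the two labels perfectly scores one bit here, while a word spread evenly across both labels scores zero, which is why such a score can be used to rank candidate features.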

AN EFFICIENT SENTIMENT ANALYSIS BASED DEEP LEARNING CLASSIFICATION MODEL TO EVALUATE TREATMENT QUALITY

IF 0.6, Zone 4 (Computer Science)

Malaysian Journal of Computer Science Pub Date : 2022-01-31 DOI: 10.22452/mjcs.vol35no1.1

Samer Abdulateef Waheeb, Naseer Ahmed Khan, Xuequn Shang

Abstract: Extracting information with an automated system from unstructured medical documents, such as patient discharge summaries in health care centers, is considered a big challenge. Sentiment analysis of medical records has gained significant attention worldwide as a way to understand the behaviors of both clinicians and patients. However, sentiment analysis of discharge summaries still does not provide a clear picture of the information available in these summaries. This study proposes a novel machine learning-based unsupervised sentiment analysis technique to classify discharge summaries using TF-IDF, Word2Vec, GloVe, FastText, and BERT as deep learning approaches with statistical methods and clustering. Our proposed model is an unsupervised sentiment framework that provides good understanding and insights of the clinical features that are not captured in the electronic health data records. Moreover, it is a hybrid sentiment model consisting of a clustering technique and vector space models for selecting the distinctive terms. The main intensity of measured sentiment is captured using the polarity of positive and negative terms in the discharge summary. The combination of the SentiWordNet platform and our approach is used to build a lexicon sentiment dataset (assignment polarity). Experiments show that our suggested method achieves 93% accuracy and significantly outperforms other state-of-the-art approaches based on the inspiration of sentiment analysis technique to examine the treatment quality for discharge summaries.

Citations: 5

AUTOMATED ARABIC ESSAY SCORING BASED ON HYBRID STEMMING WITH WORDNET

IF 0.6, Zone 4 (Computer Science)

Malaysian Journal of Computer Science Pub Date : 2021-12-31 DOI: 10.22452/mjcs.sp2021no2.4

Mohammad Alobed, Abdallah M M Altrad, Zainab Binti Abu Bakar, N. Zamin

Abstract: Schools, universities, and other educational institutions have been forced to close their doors because of the coronavirus outbreak. E-learning has become an option, and the need to integrate it into the educational process has long been discussed. E-learning uses a variety of evaluation methods, one of which is the essay. This research introduces a new model for Arabic Automated Essay Grading (AAEG) that has been developed to reduce human bias mistakes and costs while saving time. However, AAEG is still in its infancy. The model relies on a new hybrid stemming method with Arabic WordNet (AWN). The primary goal of stemming is to reduce inflectional forms of words to root words. The hybrid method is based on different techniques: Extended Light Stemmer, ISRI, and lookup tables (AWN). The data used in this study consist of 3050 words with their roots, which were retrieved from AWN and then stemmed using several algorithms (Light10, ISRI, Hybrid...). For evaluation, the metrics used were accuracy, precision, recall, and F1-score. When the performance of the different stemming algorithms was compared, the hybrid stemming method had the best results; therefore, the AAEG will improve with hybrid stemming.

Citations: 2

IDENTIFYING THE ETHICAL ISSUES IN TWITTER: A KNOWLEDGE ACQUISITION FOR ONTOLOGY

IF 0.6, Zone 4 (Computer Science)

Malaysian Journal of Computer Science Pub Date : 2021-12-31 DOI: 10.22452/mjcs.sp2021no2.7

Mohamad Hafizuddin Mohamed Najid, Z. Zulkifli, R. Othman, Rohaiza Rokis, A. A. Salahuddin

Abstract: Social media is an open platform to communicate, share and exchange information freely. This uncontrolled exchange of information has both negative and positive impacts on others’ lives. In this regard, this study aims to identify ethical issues in this information in line with Ibn Khaldun’s ethical considerations. Among the many social networking sites, Twitter has been identified as one of the most popular microblogging social networking platforms. Using a simple algorithm in R programming and 43 keywords based on Ibn Khaldun’s thoughts, 1075 public tweets were extracted from Twitter as a sample of ethical issues. Sentiment analysis in Parallel Dots was performed on the collected tweets, and it was discovered that 700 of the tweets are positive statements, 229 are neutral statements, and 146 are negative statements. After validating these sentiments, the study proposed the ethical issues identified from the tweets as a domain for developing ontology relationships with Ibn Khaldun’s thoughts. Further study can be carried out with wider data from various sources beyond the limitations of this study. Thus, a semantic database could serve as a guideline for SNS ethical issues based on Ibn Khaldun’s thoughts.

Citations: 0

MAPPING DEFORESTATION IN PERMANENT FOREST RESERVE OF PENINSULAR MALAYSIA WITH MULTI-TEMPORAL SAR IMAGERY AND U-NET BASED SEMANTIC SEGMENTATION

IF 0.6, Zone 4 (Computer Science)

Malaysian Journal of Computer Science Pub Date : 2021-12-31 DOI: 10.22452/mjcs.sp2021no2.2

Muhammad Azzam A. Wahab, Ely Salwana Mat Surin, Norshita Mat Nayan, Hameedur Rahman

Abstract: Deforestation is the long-term or permanent conversion of forest land to other uses, such as agriculture, mining, and urban development. As a result, deforestation has catastrophic consequences for the environment, including the loss of biodiversity, disruption of clean water supplies, and the acceleration of climate change. According to statistics, deforestation in developing countries is proceeding at an alarming rate, including in Malaysia, where plantation activities are the primary cause of forest loss. Recent anecdotal studies have demonstrated the effectiveness of the deep learning-based (DL) approach in producing deforestation maps. However, few studies concentrate on the DL approach for synthetic aperture radar (SAR) imaging, due to the complexity of the computational concepts of the method. SAR imagery can be challenging to interpret, but its all-weather and all-day capability can be critical in forest monitoring compared to optical imagery. Thus, in this study, we propose to map deforestation areas in Permanent Forest Reserve (HSK) using multi-temporal Sentinel-1 SAR data. A deep learning-based U-Net was employed to classify the SAR imagery as forest and non-forest due to its semantic segmentation capabilities. The experimental results showed that the proposed deep learning-based technique achieved an intersection over union (IoU) of 0.993 and an overall accuracy (OA) of 0.980. We also explain the entire procedure from beginning to end as simply as possible for beginners to comprehend. In brief, the findings of this study have the potential to improve monitoring of damaged HSK areas, prioritize the restoration of the affected forest areas, and protect the forest lands from illegal deforestation activities.

Citations: 0
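The two metrics reported in the abstract above, intersection over union (IoU) and overall accuracy (OA), can be sketched for a binary forest/non-forest mask as follows. This is a minimal illustration that assumes IoU is computed for the forest class only; the abstract does not say whether the reported IoU is per-class or averaged, and the masks below are made up.

```python
def iou_and_accuracy(predicted, reference):
    """Return (IoU of class 1, overall accuracy) for two equally sized binary masks."""
    pairs = [
        (p, r)
        for pred_row, ref_row in zip(predicted, reference)
        for p, r in zip(pred_row, ref_row)
    ]
    tp = sum(1 for p, r in pairs if p == 1 and r == 1)  # forest predicted, forest in reference
    fp = sum(1 for p, r in pairs if p == 1 and r == 0)  # forest predicted, non-forest in reference
    fn = sum(1 for p, r in pairs if p == 0 and r == 1)  # forest missed
    tn = sum(1 for p, r in pairs if p == 0 and r == 0)  # non-forest agreed
    union = tp + fp + fn
    iou = tp / union if union else 1.0  # both masks empty: treat as perfect overlap
    accuracy = (tp + tn) / len(pairs)
    return iou, accuracy


# Hypothetical 2x3 masks: 1 = forest, 0 = non-forest.
predicted_mask = [[1, 1, 0], [1, 0, 0]]
reference_mask = [[1, 1, 0], [0, 0, 0]]
iou, accuracy = iou_and_accuracy(predicted_mask, reference_mask)
```

In this toy case one pixel is wrongly predicted as forest, so the IoU is 2/3 while the overall accuracy is 5/6. The gap shows how OA can stay high when the dominant class is easy to predict, which is one reason segmentation studies report both.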

AN EXPERIMENTAL EVALUATION OF DEEP NEURAL NETWORK MODEL PERFORMANCE FOR THE RECOGNITION OF CONTRADICTORY MEDICAL RESEARCH CLAIMS USING SMALL AND MEDIUM-SIZED CORPORA

IF 0.6, Zone 4 (Computer Science)

Malaysian Journal of Computer Science Pub Date : 2021-12-31 DOI: 10.22452/mjcs.sp2021no2.5

Fatin Syafiqah Yazi, Wan-Tze Vong, V. Raman, Patrick Hang Hui Then, Mukulraj J Lunia

Abstract: Corpora come in various shapes and sizes and play an essential role in facilitating Natural Language Processing (NLP) tasks. However, the availability of corpora specialized for Evidence-Based Medicine (EBM) related tasks is limited. This study aims to discover how the size of a corpus influences the performance of our Deep Neural Network (DNN) model developed for contradiction detection in medical literature. We explored the potential of the EBM Summarizer corpus by Mollá and Santiago-Martínez, a medium-sized corpus, for use with our contradiction detection model. The dataset preparation involves filtering out open-ended questions, duplicate claims, and vague claims. As a result, two datasets were created, with the claim input represented by sniptext in one dataset and longtext in the other. Experiments were conducted with varying numbers of hidden layers and units of the model using different datasets. The performance of the DNN model was recorded and compared with the result of using a small-sized corpus. It was found that the DNN model performance did not improve even after it was trained with a larger dataset derived from the medium-sized corpus. The factors may include the limitations of the DNN model itself and the quality of the datasets.

Citations: 1

SEMANTIC GRAPH KNOWLEDGE REPRESENTATION FOR AL-QURAN VERSES BASED ON WORD DEPENDENCIES

IF 0.6, Zone 4 (Computer Science)

Malaysian Journal of Computer Science Pub Date : 2021-12-31 DOI: 10.22452/mjcs.sp2021no2.9

Muhammad Muhtadi Mohamad Khazani, H. Mohamed, Tengku Mohd Tengku Sembok, Nurhafizah Moziyana Mohd Yusop, Sharyar Wani, Yonis Gulzar, Mohd Hazali Mohamed Halip, Syahaneim Marzukh, Zahri Yunos

Abstract: Semantic approaches present an efficient, detailed and easily understandable representation of knowledge from documents. Al-Quran contains a vast amount of knowledge that needs appropriate knowledge extraction. A semantic-based approach can help in designing an efficient and explainable knowledge representation model for Al-Quran. This research aims to propose a semantic-graph knowledge representation model for verses of Al-Quran based on word dependencies. These features are used in the proposed knowledge representation model, allowing semantic graph matching to improve the accuracy of Al-Quran search applications. The proposed knowledge representation model is essentially a formalism for generating a semantic graph representation of Quranic verses, which can be applied to knowledge base construction for other applications such as information retrieval systems. A set of rules called Semantic Dependency Triple Rules are defined to be mapped into the semantic graph representing the verse's logic. The rules translate word dependencies and other NLP metadata into a triple form that holds logical information. The proposed model has been tested with an English translation of Al-Quran on a document retrieval prototype. The basic system has been enhanced with anaphoric pronoun correction, which has shown improvement in retrieval performance. The results have been compared with a closely related system and evaluated on the accuracy of the document retrieval in Precision, Recall and F-score measurements. The proposed model has achieved 65%, 60% and 62.4% for the measurements, respectively. It has also improved the overall accuracy of the previous system by 43.8%.

Citations: 2
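The three retrieval figures in the abstract above are mutually consistent if the F-score is taken to be the balanced F1 score, the harmonic mean of precision and recall. That reading is an assumption, since the abstract does not define the measure. A minimal check:

```python
def f_score(precision, recall):
    """Balanced F1 score: the harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0  # avoid division by zero when both are zero
    return 2 * precision * recall / (precision + recall)


# Precision and recall as reported in the abstract (65% and 60%).
reported_f = f_score(0.65, 0.60)
```

With these inputs the function returns approximately 0.624, which agrees with the reported F-score of 62.4%.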

MELODY TRAINING WITH SEGMENT-BASED TILT CONTOUR FOR QURANIC TARANNUM

IF 0.6, Zone 4 (Computer Science)

Malaysian Journal of Computer Science Pub Date : 2021-12-31 DOI: 10.22452/mjcs.sp2021no2.1

Haslizatul Mohamed Hanum, Luqmanul Hakim Md Abas, Aiman Syamil Aziz, Zainab Abu Bakar, Norizan Mat Diah, W. F. Wan Ahmad, Nazlena Mohamad Ali, N. Zamin

Abstract: Tarannum, or melodic recitation of Quranic verses, employs the softness of the voice in reading the holy verses of the Quran. Melody training technology allows users to practise repetitively while also providing feedback on their performance. This paper describes an application that captures the pattern of tarannum melodies (from Quranic recitations) and provides feedback to the user. Recordings of Quranic verses are collected from an expert reciting Bayati tarannum. The samples are pre-processed into segmented tarannum verse-contours using pitch sequences. Using the k-Nearest Neighbor (kNN) classifier, the melody patterns are trained on 20 samples. Input vectors are formed by computing the melody verse-contour representation using mean, standard deviation, and slope values and combining them with an identified Tilt-based contour label. A tarannum training prototype is built to test the similarity between a user’s recitation and the trained patterns. To identify similarity between a pair of verse-contours, the application employs a shape-based contour similarity algorithm. The proposed application also provides feedback in the form of a grade and a percentage of accuracy, as determined by a melody curve similarity algorithm. In the results, the current samples have an overall shape-based weighted score of 66%. Some samples are successfully classified with a similarity score as high as 80% individually. The study provides an alternative interactive session for people who want to learn tarannum, as well as a preliminary step toward understanding the melodic patterns of tarannum. The application provides a repetitive training experience and encourages users to improve their recitations in order to achieve the highest possible score.

Citations: 0
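The feature vector described in the abstract above (mean, standard deviation, and slope of a pitch contour) and the kNN step can be sketched roughly as below. This is an illustration under assumptions, not the authors' implementation: the slope is taken as a least-squares fit against the sample index, the Tilt-based label is left out, and the pitch values and class names are hypothetical.

```python
import math
import statistics
from collections import Counter


def contour_features(pitches):
    """Summarise a pitch sequence (at least two values) as (mean, std, slope)."""
    n = len(pitches)
    mean = statistics.fmean(pitches)
    std = statistics.pstdev(pitches)
    # Least-squares slope of pitch against the sample index 0..n-1.
    x_mean = (n - 1) / 2
    numerator = sum((i - x_mean) * (p - mean) for i, p in enumerate(pitches))
    denominator = sum((i - x_mean) ** 2 for i in range(n))
    return (mean, std, numerator / denominator)


def knn_predict(query, training, k=1):
    """Label a feature vector by majority vote among its k nearest neighbours."""
    nearest = sorted(training, key=lambda item: math.dist(query, item[0]))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical training contours (pitch in Hz) with made-up shape labels.
training = [
    (contour_features([200, 210, 220, 230]), "rising"),
    (contour_features([230, 220, 210, 200]), "falling"),
]
query = contour_features([190, 205, 215, 228])
predicted = knn_predict(query, training)
```

In practice the three features sit on different scales, so some form of normalisation before computing distances would likely matter; the sketch omits it to stay short.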

HYBRID DISCRETE WAVELET TRANSFORM AND TEXTURE ANALYSIS METHODS FOR FEATURE EXTRACTION AND CLASSIFICATION OF BREAST DYNAMIC THERMOGRAM SEQUENCES

IF 0.6, Zone 4 (Computer Science)

Malaysian Journal of Computer Science Pub Date : 2021-12-31 DOI: 10.22452/mjcs.sp2021no2.8

Khaleel Al-Rababah, M. Mustaffa, S. Doraisamy, F. Khalid, Luís Filipe de Pina Júnior

Abstract: Breast cancer is a common cancer that affects women, causing thousands of casualties every year. A cancerous tumor causes an increase in temperature near the region of the tumor. The heat generated is transferred to the skin surface, so the tumor area is warmer than the healthy area. Detecting breast cancer in its early stages can save women’s lives and lower the cost burden. Thermography is an imaging technique used for breast cancer detection. Dynamic thermography generates infrared images over a fixed time, measured in minutes, to detect the difference between the normal and cancerous areas in the images. In this research, we propose a methodology to deal with the changes of temperature in patients' breasts by defining a set of efficient features resulting from the extraction and reduction of coefficients obtained from breast thermogram images, followed by classification. Texture feature methods (Histogram of Oriented Gradients (HOG) and Discrete Curvelet transform) are applied separately using the HH (high-high) and HL (high-low) sub-band images of the Discrete Wavelet transform (DWT). HOG-based features and Curvelet features are extracted by reducing the coefficient vectors returned by the two methods. Finally, a Support Vector Machine (SVM) binary classifier is used to classify the images as either normal or abnormal. The proposed work achieved an Accuracy of 98.2%, Sensitivity of 97.7%, and Specificity of 98.2% in empirical studies using a dynamic breast thermogram dataset.

Citations: 0

IMPLEMENTATION OF HYPERPARAMETER OPTIMISATION AND OVER-SAMPLING IN DETECTING CYBERBULLYING USING MACHINE LEARNING APPROACH

IF 0.6, Zone 4 (Computer Science)

Malaysian Journal of Computer Science Pub Date : 2021-12-31 DOI: 10.22452/mjcs.sp2021no2.6

Wan Noor Anira Wan Ali, M. Mohd, F. Fauzi, Kiyoaki Shirai, Muhammad Junaidi Mahamad Noor

Abstract: Online social networks have become a necessity to everyone around the world. In particular, online social networks have enabled us to connect to one another regardless of time, for as long as we have social media and social networking as platforms for broadcasting information and communicating, respectively. However, this evolution has resulted in people possibly committing various cybercrimes, such as cyberbullying. To address this issue, machine learning can be utilised to counter cyberbullying in online social networks. Thus, this study proposed a framework with a set of features consisting of word and character term frequency–inverse document frequency and word embedding by using Word2vec and six types of list terms: profane words, proper nouns, negation words, ‘allness’ terms, diminisher words and intensifier words. These features were divided into four groups before being fed into the linear support vector classifier to train our model, using ASKfm as the data set, with hyperparameter tuning and over-sampling. Results indicated that the proposed framework provided significant outcomes, with the trained model reaching a highest area under the curve of 99.24% and an F-measure of 97.38%.

Citations: 2
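The abstract above mentions over-sampling but does not say which method was used. As one possible reading, the sketch below shows plain random over-sampling, which duplicates minority-class samples until every class is the same size. The function name and the toy texts are hypothetical and are not drawn from the ASKfm data set.

```python
import random
from collections import Counter


def random_oversample(samples, labels, seed=0):
    """Duplicate minority-class samples at random until all classes are the same size."""
    rng = random.Random(seed)  # fixed seed so the result is reproducible
    by_class = {}
    for sample, label in zip(samples, labels):
        by_class.setdefault(label, []).append(sample)
    target = max(len(group) for group in by_class.values())
    out_samples, out_labels = [], []
    for label, group in by_class.items():
        # Draw extra copies (with replacement) to reach the majority-class size.
        extra = [rng.choice(group) for _ in range(target - len(group))]
        for sample in group + extra:
            out_samples.append(sample)
            out_labels.append(label)
    return out_samples, out_labels


# Hypothetical, heavily imbalanced toy data.
texts = ["nice photo", "see you soon", "good luck", "you are worthless"]
labels = ["normal", "normal", "normal", "bullying"]
balanced_texts, balanced_labels = random_oversample(texts, labels)
class_sizes = Counter(balanced_labels)
```

Over-sampling is normally applied to the training split only, so that duplicated samples do not leak into the data used for evaluation.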