{"title":"Prediction of Autism-Related Genes Using a New Clustering-Based Under-Sampling Method","authors":"Xuan Tho Dang, Duong Hung Bui, Thi Hong Nguyen, T. Nguyen, D. Tran","doi":"10.1109/KSE.2019.8919377","DOIUrl":"https://doi.org/10.1109/KSE.2019.8919377","url":null,"abstract":"Autism is one of the neurological disorders that occur in children. Autism has many causes, one of which is genetic. Therefore, in order to find effective treatments, we need to discover the genes related to autism. In this paper, we use a computational approach to train a model that can predict new autism-related candidate genes. The methodology combines different data sources, such as protein-protein interaction networks, a microRNA (miRNA)-target network, and known autism-related genes, into an integrated network. The structural properties of this network are represented as a vector dataset, and a binary classification problem is formulated. However, because the number of known autism-related genes is very small, we face an imbalanced data classification problem. To solve this issue, a clustering-based under-sampling algorithm for data balancing is proposed. Training classifiers such as SVMs, k-NN, and RFs, we obtained G-mean scores 1-3% higher than those obtained without any data balancing strategy. 
These results suggest that our proposed model may contribute to finding new autism-related gene candidates.","PeriodicalId":439841,"journal":{"name":"2019 11th International Conference on Knowledge and Systems Engineering (KSE)","volume":"57 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124868345","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"KSE 2019 Conference Proceedings","authors":"","doi":"10.1109/kse.2019.8919329","DOIUrl":"https://doi.org/10.1109/kse.2019.8919329","url":null,"abstract":"","PeriodicalId":439841,"journal":{"name":"2019 11th International Conference on Knowledge and Systems Engineering (KSE)","volume":"47 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127359643","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Personalized PageRank Based Feature Selection for High-dimension Data","authors":"Zhibo Zhu, Qinke Peng, Xinyu Guan","doi":"10.1109/KSE.2019.8919274","DOIUrl":"https://doi.org/10.1109/KSE.2019.8919274","url":null,"abstract":"Feature selection is critical for data mining applications, especially for extracting valuable information from high-dimensional data. It not only improves the performance of learning models, but also enhances the interpretability and generality of knowledge. In this paper, we propose a feature selection method based on personalized PageRank. First, a non-symmetric metric derived from mutual information is used to build a feature redundancy network, in which nodes are features and directed edges represent the redundancy relation between features. Then, we compute the personalized PageRank on the network and assign each feature a score as its redundancy measure given a specific feature subset. Finally, this redundancy measure is integrated into the generalized MRMR framework to perform feature selection. Owing to the global characteristics of the network and PageRank, our method can provide a better measure of the high-order relationship between a candidate feature and the subset of selected features. 
Extensive experiments conducted on five microarray datasets verify the effectiveness of the proposed method, which outperforms popular benchmarks.","PeriodicalId":439841,"journal":{"name":"2019 11th International Conference on Knowledge and Systems Engineering (KSE)","volume":"18 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126631438","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Predicting Query Difficulty in IR: Impact of Difficulty Definition","authors":"J. Mothe, Léa Laporte, Adrian-Gabriel Chifu","doi":"10.1109/KSE.2019.8919433","DOIUrl":"https://doi.org/10.1109/KSE.2019.8919433","url":null,"abstract":"While information exists on almost any topic on the web, we know from information retrieval (IR) evaluation programs that search systems fail to answer some queries effectively. In the IR literature, system failure is associated with query difficulty. However, there is no clear definition of query difficulty. This paper investigates several ways of defining query difficulty and analyses the impact of these definitions on query difficulty prediction results. Our experiments show that the most stable definition across collections is a threshold-based definition of query difficulty classes.","PeriodicalId":439841,"journal":{"name":"2019 11th International Conference on Knowledge and Systems Engineering (KSE)","volume":"39 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115293250","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Recovering Capitalization for Automatic Speech Recognition of Vietnamese using Transformer and Chunk Merging","authors":"H. P. T. Thu, B. N. Thai, V. H. Nguyen, Quoc Truong Do, Luong Chi Mai, Huyen Thi Minh Nguyen","doi":"10.1109/KSE.2019.8919342","DOIUrl":"https://doi.org/10.1109/KSE.2019.8919342","url":null,"abstract":"In the last few years, Automatic Speech Recognition (ASR) systems for Vietnamese have been used in various applications with exceptional results. Nevertheless, ASR output still has limitations, such as the absence of punctuation, capitalization, and standardized numeric data. These shortcomings make it difficult for readers to understand the context efficiently and for Natural Language Processing (NLP) tasks to perform well. Capitalization is one of the most critical factors for enhancing human readability, parsing, and Named Entity Recognition (NER). Additionally, Vietnamese ASR output has its own features compared to English, such as lisp words, local words, compound words, and homophones. In this paper, we propose a method to recover capitalization for long-speech ASR transcription of Vietnamese using Transformer models and chunk merging. Furthermore, we perform decoding in parallel while improving the prediction accuracy.","PeriodicalId":439841,"journal":{"name":"2019 11th International Conference on Knowledge and Systems Engineering (KSE)","volume":"37 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130224468","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Automatic blotch removal using a perceptual approach","authors":"Nguyen thi Quynh Hoa, Dang Thanh Trung, Azeddine Beghdadi, Heyfa Ammar, Amel Benazza","doi":"10.1109/KSE.2019.8919321","DOIUrl":"https://doi.org/10.1109/KSE.2019.8919321","url":null,"abstract":"One important issue that needs attention in the area of old film storage and restoration is blotch handling. In this work, a new proposal comprising an automatic detection method and an inpainting scheme is introduced. First, a technique for automatically detecting blotches based on local changes of pixels across consecutive frames is applied. Specifically, a two-stage Simplified Ranked Order Difference detector is proposed to identify blotches in frames. Next, an improved inpainting method is applied to restore the blotches so that the restoration is undetectable by viewers. Our proposal runs automatically without external parameters. It has been tested on a series of natural images with different sizes and resolutions. Experimental results show that our approach removes blotches with fairly high accuracy and restores them quite smoothly. Based on the result analysis, the proposal has considerable potential for future applications. Index Terms: blotch detection, blotch removal, restoration, inpainting","PeriodicalId":439841,"journal":{"name":"2019 11th International Conference on Knowledge and Systems Engineering (KSE)","volume":"76 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116953866","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"DEMOS: A Design Method for demOcratic information System","authors":"Raphaëlle Bour, C. Soulé-Dupuy, N. Vallès-Parlangeau","doi":"10.1109/KSE.2019.8919427","DOIUrl":"https://doi.org/10.1109/KSE.2019.8919427","url":null,"abstract":"This paper presents a method to support the design and implementation of democratic information systems in organizations. The ethical principle of democracy is today a challenge, especially for proposing a solution to the major issue of Shadow IT. Our method, called DEMOS, focuses on the concept of the end-users’ viewpoint to propose a participative and collaborative approach to information system co-construction. It combines different participative tools such as photolanguage, mind maps, or User Story writing. This article presents the strategies and intentions of the DEMOS process using the MAP formalism and proposes a detailed description of the DEMOS meta-model and key concepts. We conducted an experiment in 2018 with the lifelong training service at the University Toulouse 1. A qualitative study evaluates the effectiveness of DEMOS.","PeriodicalId":439841,"journal":{"name":"2019 11th International Conference on Knowledge and Systems Engineering (KSE)","volume":"14 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124398506","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Effective Text Data Preprocessing Technique for Sentiment Analysis in Social Media Data","authors":"Saurav Pradha, M. Halgamuge, N. Q. Vinh","doi":"10.1109/KSE.2019.8919368","DOIUrl":"https://doi.org/10.1109/KSE.2019.8919368","url":null,"abstract":"In the big data era, data is generated in real time or near real time. Thus, businesses can use this ever-growing volume of data for data-driven or information-driven decision-making to improve their businesses. Social media platforms, like Twitter, generate an enormous amount of such data. However, social media data are often unstructured and difficult to manage. Hence, this study proposes an effective text data preprocessing technique and develops an algorithm to train Support Vector Machine (SVM), Deep Learning (DL), and Naïve Bayes (NB) classifiers to process Twitter data. We develop an algorithm that weights the sentiment score in terms of the weight of the hashtag and the cleaned text. In this study, we (i) compare different preprocessing techniques (such as stemming, lemmatization, and spelling correction) on data collected from Twitter to identify the most efficient method, and (ii) develop an algorithm to weight the scores of the hashtag and cleaned text to obtain the sentiment. We retrieved N=1,314,000 Twitter data records and compared the popularity of two products, Google Now and Amazon Alexa. Using our data preprocessing algorithm and sentiment weight score algorithm, we train SVM, DL, and NB models. The results show that the stemming technique performed best in terms of computational speed. Additionally, the accuracy of the algorithm was tested against manually sorted sentiments and sentiments produced before text data preprocessing. The results demonstrated that the output of the algorithm was close to the manually annotated sentiments. In terms of model performance, the SVM performed better, with an accuracy of 90.3%, perhaps due to the unstructured nature of Twitter data. 
Previous studies used conventional techniques; hence, no precise methods were used for cleaning the text. Therefore, our approach confirms that a proper text data preprocessing technique plays a significant role in the prediction accuracy and computational time of the classifier when using unstructured Twitter data.","PeriodicalId":439841,"journal":{"name":"2019 11th International Conference on Knowledge and Systems Engineering (KSE)","volume":"283 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134092690","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A solution to ethical and legal problem with the decision-making model of autonomous vehicles","authors":"Quach Hai Tho, Huynh Cong Phap, Pham Anh Phuong","doi":"10.1109/KSE.2019.8919452","DOIUrl":"https://doi.org/10.1109/KSE.2019.8919452","url":null,"abstract":"With the development of autonomous vehicle technology, the ability to avoid obstacles appropriately, ensuring safety and improving transportation effectiveness, has increased. However, not all collisions can be avoided, and autonomous vehicles then have to make ethical and legal decisions in emergency situations. This article presents the ethical and legal issues included in the decision-making model for emergency situations, with indicators related to the operating environment treated as indicators of cooperation in vehicle navigation behavior and as input variables of the model to be built.","PeriodicalId":439841,"journal":{"name":"2019 11th International Conference on Knowledge and Systems Engineering (KSE)","volume":"27 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114178716","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"KSE 2019 Executive Committee","authors":"","doi":"10.1109/kse.2019.8919298","DOIUrl":"https://doi.org/10.1109/kse.2019.8919298","url":null,"abstract":"","PeriodicalId":439841,"journal":{"name":"2019 11th International Conference on Knowledge and Systems Engineering (KSE)","volume":"17 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127446152","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}