2019 11th International Conference on Knowledge and Systems Engineering (KSE): Latest Articles, Page 7

Prediction of Autism-Related Genes Using a New Clustering-Based Under-Sampling Method

2019 11th International Conference on Knowledge and Systems Engineering (KSE) Pub Date : 2019-10-01 DOI: 10.1109/KSE.2019.8919377

Xuan Tho Dang, Duong Hung Bui, Thi Hong Nguyen, T. Nguyen, D. Tran

Abstract: Autism is a neurological disorder that occurs in children. It has many causes, one of which is genetic. Therefore, to find effective treatments, we need to discover the genes related to autism. In this paper, we use a computational approach to train a model that can predict new autism-related candidate genes. The methodology combines different data sources, such as protein-protein interaction networks, a microRNA (miRNA)-target network, and known autism-related genes, into an integrated network. The structural properties of this network are represented as a vector dataset, and a binary classification problem is formulated. However, because the number of known autism-related genes is very small, we face an imbalanced-data classification problem. To address this issue, we propose a clustering-based under-sampling algorithm for data balancing. Training classifiers such as SVMs, k-NN, and RFs, we obtained G-mean values 1-3% higher than those obtained without any data balancing strategy. These results suggest that our proposed model may contribute to finding new autism-related gene candidates.
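
Illustrative note: the abstract above reports its gains in G-mean, which is commonly defined as the geometric mean of sensitivity and specificity. The following is a minimal sketch of that standard definition, not the authors' code; the function name `g_mean` and the toy labels are assumptions made for illustration only.

```python
import math

def g_mean(y_true, y_pred, positive=1):
    """Geometric mean of sensitivity and specificity for binary labels.

    Illustrative helper (hypothetical name), using the standard definition
    G-mean = sqrt(TP / (TP + FN) * TN / (TN + FP)).
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return math.sqrt(sensitivity * specificity)

# Toy imbalanced data: 2 positives, 8 negatives (made up for illustration).
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
always_negative = [0] * 10                      # 80% accuracy, finds no positives
mixed = [1, 0, 0, 0, 0, 0, 0, 0, 1, 1]          # finds 1 of 2 positives, 2 false alarms

print(g_mean(y_true, always_negative))
print(g_mean(y_true, mixed))
```

A classifier that always predicts the majority class scores zero on this measure despite its high accuracy, which is why G-mean is a common choice for imbalanced problems such as the one described in the abstract.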

Citations: 1

KSE 2019 Conference Proceedings

2019 11th International Conference on Knowledge and Systems Engineering (KSE) Pub Date : 2019-10-01 DOI: 10.1109/kse.2019.8919329

Citations: 0

Personalized PageRank Based Feature Selection for High-dimension Data

2019 11th International Conference on Knowledge and Systems Engineering (KSE) Pub Date : 2019-10-01 DOI: 10.1109/KSE.2019.8919274

Zhibo Zhu, Qinke Peng, Xinyu Guan

Abstract: Feature selection is critical in data mining applications, especially for extracting valuable information from high-dimensional data. It not only improves the performance of learning models but also enhances the interpretability and generality of knowledge. In this paper, we propose a feature selection method based on personalized PageRank. First, a non-symmetric metric derived from mutual information is used to build a feature redundancy network, in which nodes are features and directed edges represent the redundancy relation between features. Then, we compute the personalized PageRank on the network and assign each feature a score as its redundancy measure given a specific feature subset. Finally, this redundancy measure is integrated into the generalized MRMR framework to perform feature selection. Owing to the global characteristics of the network and PageRank, our method provides a better measure of the high-order relationship between a candidate feature and the subset of selected features. Extensive experiments on five microarray datasets verify the effectiveness of the proposed method, which outperforms popular benchmarks.
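
Illustrative note: the abstract above scores features with personalized PageRank on a directed network. The sketch below shows only the generic personalized PageRank computation by power iteration on a small directed graph; it is not the authors' redundancy metric or their MRMR integration. The function name, the damping factor `alpha=0.85`, the dangling-node handling, and the toy graph are all assumptions made for illustration.

```python
def personalized_pagerank(nodes, out_edges, seeds, alpha=0.85, iters=200, tol=1e-12):
    """Power iteration for personalized PageRank (generic textbook form).

    nodes:     list of node ids
    out_edges: dict mapping a node to the list of nodes it points to
    seeds:     set of nodes the random walk restarts from (the personalization set)
    """
    restart = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    rank = dict(restart)
    for _ in range(iters):
        # With probability (1 - alpha) the walk jumps back to the seed set.
        new = {n: (1.0 - alpha) * restart[n] for n in nodes}
        for u in nodes:
            targets = out_edges.get(u, [])
            if targets:
                share = alpha * rank[u] / len(targets)
                for v in targets:
                    new[v] += share
            else:
                # Assumption: a dangling node returns its mass to the seed set.
                for n in nodes:
                    new[n] += alpha * rank[u] * restart[n]
        delta = sum(abs(new[n] - rank[n]) for n in nodes)
        rank = new
        if delta < tol:
            break
    return rank

# Toy graph (made up): a -> b -> c -> a, plus d -> a; restart only at "a".
nodes = ["a", "b", "c", "d"]
out_edges = {"a": ["b"], "b": ["c"], "c": ["a"], "d": ["a"]}
scores = personalized_pagerank(nodes, out_edges, seeds={"a"})
print(scores)
```

In a feature-selection setting such as the one the abstract describes, the seed set would presumably correspond to the already selected features, so that each candidate's score reflects its relation to that subset; that mapping is an interpretation here, not a detail given in the abstract.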

Citations: 1

Predicting Query Difficulty in IR: Impact of Difficulty Definition

2019 11th International Conference on Knowledge and Systems Engineering (KSE) Pub Date : 2019-10-01 DOI: 10.1109/KSE.2019.8919433

J. Mothe, Léa Laporte, Adrian-Gabriel Chifu

Citations: 6

Recovering Capitalization for Automatic Speech Recognition of Vietnamese using Transformer and Chunk Merging

2019 11th International Conference on Knowledge and Systems Engineering (KSE) Pub Date : 2019-10-01 DOI: 10.1109/KSE.2019.8919342

H. P. T. Thu, B. N. Thai, V. H. Nguyen, Quoc Truong Do, Luong Chi Mai, Huyen Thi Minh Nguyen

Citations: 5

Automatic blotch removal using a perceptual approach

2019 11th International Conference on Knowledge and Systems Engineering (KSE) Pub Date : 2019-10-01 DOI: 10.1109/KSE.2019.8919321

Nguyen thi Quynh Hoa, Dang Thanh Trung, Azeddine Beghdadi, Heyfa Ammar, Amel Benazza

Citations: 1

DEMOS: A Design Method for demOcratic information System

2019 11th International Conference on Knowledge and Systems Engineering (KSE) Pub Date : 2019-10-01 DOI: 10.1109/KSE.2019.8919427

Raphaëlle Bour, C. Soulé-Dupuy, N. Vallès-Parlangeau

Citations: 0

Effective Text Data Preprocessing Technique for Sentiment Analysis in Social Media Data

2019 11th International Conference on Knowledge and Systems Engineering (KSE) Pub Date : 2019-10-01 DOI: 10.1109/KSE.2019.8919368

Saurav Pradha, M. Halgamuge, N. Q. Vinh

Abstract: In the big data era, data is generated in real time or near real time. Businesses can therefore use this ever-growing volume of data for data-driven or information-driven decision-making to improve their operations. Social media platforms such as Twitter generate an enormous amount of such data. However, social media data are often unstructured and difficult to manage. Hence, this study proposes an effective text data preprocessing technique and develops an algorithm to train Support Vector Machine (SVM), Deep Learning (DL), and Naïve Bayes (NB) classifiers to process Twitter data. We develop an algorithm that weights the sentiment score in terms of the weights of the hashtag and the cleaned text. In this study, we (i) compare different preprocessing techniques (stemming, lemmatization, and spelling correction) on data collected from Twitter to identify the most efficient method, and (ii) develop an algorithm that weights the scores of the hashtag and the cleaned text to obtain the sentiment. We retrieved N=1,314,000 Twitter data records and compared the popularity of two products, Google Now and Amazon Alexa. Using our data preprocessing algorithm and sentiment weight score algorithm, we train SVM, DL, and NB models. The results show that stemming performed best in terms of computational speed. Additionally, the accuracy of the algorithm was tested against manually sorted sentiments and sentiments produced before text data preprocessing. The results demonstrated that the sentiments produced by the algorithm were close to the manually annotated sentiments. In terms of model performance, the SVM performed best, with an accuracy of 90.3%, perhaps due to the unstructured nature of Twitter data. Previous studies used conventional techniques; hence, no precise methods were used to clean the text. Therefore, our approach confirms that a proper text data preprocessing technique plays a significant role in the prediction accuracy and computational time of the classifier when using unstructured Twitter data.
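
Illustrative note: the abstract above describes cleaning tweet text and combining a hashtag score with a cleaned-text score. The sketch below shows one plausible shape for such a pipeline and is not the authors' algorithm. The tiny `LEXICON`, the weights `w_hashtag=0.4` and `w_text=0.6`, and all function names are hypothetical, and stemming, lemmatization, and spelling correction are deliberately left out to keep the example dependency-free.

```python
import re

# Hypothetical toy sentiment lexicon, for illustration only.
LEXICON = {"love": 1.0, "great": 1.0, "good": 0.5,
           "bad": -0.5, "hate": -1.0, "awful": -1.0}

URL_RE = re.compile(r"https?://\S+")
MENTION_RE = re.compile(r"@\w+")
HASHTAG_RE = re.compile(r"#(\w+)")

def split_tweet(tweet):
    """Return (cleaned word tokens, hashtags) for one tweet."""
    hashtags = [h.lower() for h in HASHTAG_RE.findall(tweet)]
    text = URL_RE.sub(" ", tweet)       # drop links
    text = MENTION_RE.sub(" ", text)    # drop @mentions
    text = HASHTAG_RE.sub(" ", text)    # hashtags are scored separately
    tokens = re.findall(r"[a-z']+", text.lower())
    return tokens, hashtags

def lexicon_score(words):
    """Mean lexicon value of the words found in LEXICON, or 0.0 if none match."""
    hits = [LEXICON[w] for w in words if w in LEXICON]
    return sum(hits) / len(hits) if hits else 0.0

def weighted_sentiment(tweet, w_hashtag=0.4, w_text=0.6):
    """Weighted sum of hashtag and cleaned-text scores (weights are assumptions)."""
    tokens, hashtags = split_tweet(tweet)
    return w_hashtag * lexicon_score(hashtags) + w_text * lexicon_score(tokens)

tweet = "I love my new speaker, it is great! #love https://t.co/xyz @someone"
print(split_tweet(tweet))
print(weighted_sentiment(tweet))
```

Scoring hashtags separately from the body text lets the two signals be weighted independently, which is the general idea the abstract names; the actual weighting scheme used in the paper is not given in the abstract.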

Citations: 48

A solution to ethical and legal problem with the decision-making model of autonomous vehicles

2019 11th International Conference on Knowledge and Systems Engineering (KSE) Pub Date : 2019-10-01 DOI: 10.1109/KSE.2019.8919452

Quach Hai Tho, Huynh Cong Phap, Pham Anh Phuong

Citations: 1

KSE 2019 Executive Committee

2019 11th International Conference on Knowledge and Systems Engineering (KSE) Pub Date : 2019-10-01 DOI: 10.1109/kse.2019.8919298

Citations: 0