2015 7th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management (IC3K): Latest Papers

Data-driven relation discovery from unstructured texts
Marilena Ditta, Fabrizio Milazzo, V. Ravì, G. Pilato, A. Augello
{"title":"Data-driven relation discovery from unstructured texts","authors":"Marilena Ditta, Fabrizio Milazzo, V. Ravì, G. Pilato, A. Augello","doi":"10.5220/0005614205970602","DOIUrl":"https://doi.org/10.5220/0005614205970602","url":null,"abstract":"This work proposes a data-driven methodology for the extraction of subject-verb-object triplets from a text corpus. Previous work in the field solved the problem by means of complex learning algorithms requiring hand-crafted examples; our proposal completely avoids learning triplets from a dataset and is built on top of a well-known baseline algorithm designed by Delia Rusu et al. The baseline algorithm uses only syntactic information for generating triplets and is characterized by a very low precision, i.e., very few triplets are meaningful. Our idea is to integrate the semantics of the words with the aim of filtering out the wrong triplets, thus increasing the overall precision of the system. The algorithm has been tested over the Reuters Corpus and has shown good performance with respect to the baseline algorithm for triplet extraction.","PeriodicalId":102743,"journal":{"name":"2015 7th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management (IC3K)","volume":"121 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-11-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123579014","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 2
Use of frequent itemset mining techniques to analyze business processes
Vladimír Bartík, Milan Pospísil
{"title":"Use of frequent itemset mining techniques to analyze business processes","authors":"Vladimír Bartík, Milan Pospísil","doi":"10.5220/0005598102730280","DOIUrl":"https://doi.org/10.5220/0005598102730280","url":null,"abstract":"Analysis of business process data can be used to discover the reasons for delays and other problems in a business process. This paper presents an approach that uses a simulator of production history. This simulator allows detecting problems at various production machines, e.g. extremely long queues of products waiting before a machine. After detection, data about the products processed before the queue increased are collected. Frequent itemsets obtained from this dataset can be used to describe the problem and its causes. The whole process of frequent itemset mining is described in this paper. The paper also describes several necessary modifications of the basic methods usually used to discover frequent itemsets.","PeriodicalId":102743,"journal":{"name":"2015 7th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management (IC3K)","volume":"5 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-11-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128526608","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 2
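The frequent-itemset step mentioned in the abstract above can be illustrated with a minimal sketch of the textbook Apriori procedure. This is not the modified method the paper describes; the function name `frequent_itemsets`, the `min_support` parameter, and the sample product attributes are all hypothetical and chosen only for illustration.

```python
from collections import Counter
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Return {itemset: count} for itemsets found in at least min_support transactions."""
    transactions = [frozenset(t) for t in transactions]
    # Level 1: count individual items.
    counts = Counter(frozenset([item]) for t in transactions for item in t)
    current = {s: c for s, c in counts.items() if c >= min_support}
    result = dict(current)
    k = 2
    while current:
        # Join step: unions of frequent (k-1)-itemsets that have exactly k items.
        candidates = {a | b for a, b in combinations(list(current), 2) if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must itself be frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(s) in current for s in combinations(c, k - 1))
        }
        # Count the support of each surviving candidate.
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        current = {s: n for s, n in counts.items() if n >= min_support}
        result.update(current)
        k += 1
    return result


# Hypothetical attributes of products processed before a queue grew.
history = [
    {"machine_A", "large_batch", "night_shift"},
    {"machine_A", "large_batch"},
    {"machine_A", "night_shift"},
    {"large_batch", "night_shift"},
    {"machine_A", "large_batch", "night_shift"},
]
itemsets = frequent_itemsets(history, min_support=3)
```

With this toy data every pair of attributes reaches the support threshold, while the full triple appears in only two transactions and is therefore dropped.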
The reverse doubling construction
Jean-François Viaud, K. Bertet, C. Demko, R. Missaoui
{"title":"The reverse doubling construction","authors":"Jean-François Viaud, K. Bertet, C. Demko, R. Missaoui","doi":"10.5220/0005613203500357","DOIUrl":"https://doi.org/10.5220/0005613203500357","url":null,"abstract":"It is well known inside the Formal Concept Analysis (FCA) community that a concept lattice could have an exponential size in the data. Hence, the size of concept lattices is a critical issue in the presence of large real-life data sets. In this paper, we propose to investigate factor lattices as a tool to get meaningful parts of the whole lattice. These factor lattices have been widely studied from the early theory of lattices to more recent work in the FCA field. This paper contains two parts. The first one gives background about lattice theory and formal concept analysis, and mainly compatible sub-contexts, arrow-closed sub-contexts and congruence relations. The second part presents a new decomposition called “reverse doubling construction” that exploits the above three notions used for the doubling convex construction investigated by Day. Theoretical results and their proofs are given as well as an illustrative example.","PeriodicalId":102743,"journal":{"name":"2015 7th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management (IC3K)","volume":"122 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-11-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132444748","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 6
Early diagnosis of Alzheimer's disease using machine learning techniques: A review paper
Aunsia Khan, Muhammad Usman
{"title":"Early diagnosis of Alzheimer's disease using machine learning techniques: A review paper","authors":"Aunsia Khan, Muhammad Usman","doi":"10.5220/0005615203800387","DOIUrl":"https://doi.org/10.5220/0005615203800387","url":null,"abstract":"Alzheimer's disease (AD), an irreversible brain disease, impairs thinking and memory while overall brain size shrinks, which ultimately leads to death. Early diagnosis of AD is essential for the development of more effective treatments. Machine learning (ML), a branch of artificial intelligence, employs a variety of probabilistic and optimization techniques that allow computers to learn from large and complex datasets. As a result, researchers frequently use machine learning for diagnosis of the early stages of AD. This paper presents a review, analysis and critical evaluation of recent work on the early detection of AD using ML techniques. Several methods achieved promising prediction accuracies; however, they were evaluated on different pathologically unproven data sets from different imaging modalities, making it difficult to make a fair comparison among them. Moreover, many other factors, such as pre-processing, the number of important attributes for feature selection, and class imbalance, distinctly affect the assessment of the prediction accuracy. To overcome these limitations, a model is proposed which comprises an initial pre-processing step followed by selection of important attributes, with classification achieved using association rule mining. Furthermore, this proposed model-based approach gives the right direction for research in early diagnosis of AD and has the potential to distinguish AD from healthy controls.","PeriodicalId":102743,"journal":{"name":"2015 7th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management (IC3K)","volume":"42 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-11-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126829687","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 45
Bringing search engines to the cloud using open source components
K. Nagi
{"title":"Bringing search engines to the cloud using open source components","authors":"K. Nagi","doi":"10.5220/0005632701160126","DOIUrl":"https://doi.org/10.5220/0005632701160126","url":null,"abstract":"The usage of search engines is nowadays extended to do intelligent analytics of petabytes of data. With Lucene being at the heart of the vast majority of information retrieval systems, several attempts have been made to bring it to the cloud in order to scale to big data. Efforts include implementing scalable distribution of the search indices over the file system, storing them in NoSQL databases, and porting them to inherently distributed ecosystems, such as Hadoop. We evaluate the existing efforts in terms of distribution, high availability, fault tolerance, manageability, and high performance. We believe that search indexing capabilities for big data can only be supported through the use of common open-source technology deployed on standard cloud platforms such as Amazon EC2, Microsoft Azure, etc. For each approach, we build a benchmarking system by indexing the whole Wikipedia content and submitting hundreds of simultaneous search requests. We measure the performance of both indexing and searching operations. We simulate node failures and monitor the recoverability of the system. We show that a system built on top of Solr and Hadoop has the best stability and manageability, while systems based on NoSQL databases present an attractive alternative in terms of performance.","PeriodicalId":102743,"journal":{"name":"2015 7th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management (IC3K)","volume":"11 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-11-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116556357","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 6
Automatic extraction of task statements from structured meeting content
K. Nagao, Keisuke Inoue, Naoya Morita, S. Matsubara
{"title":"Automatic extraction of task statements from structured meeting content","authors":"K. Nagao, Keisuke Inoue, Naoya Morita, S. Matsubara","doi":"10.5220/0005609703070315","DOIUrl":"https://doi.org/10.5220/0005609703070315","url":null,"abstract":"We previously developed a discussion mining system that records face-to-face meetings in detail, analyzes their content, and conducts knowledge discovery. Looking back on past discussion content by browsing documents, such as minutes, is an effective means for conducting future activities. In meetings at which some research topics are regularly discussed, such as seminars in laboratories, the presenters are required to discuss future issues by checking urgent matters from the discussion records. We call statements including advice or requests proposed at previous meetings “task statements” and propose a method for automatically extracting them. With this method, based on certain semantic attributes and linguistic characteristics of statements, a probabilistic model is created using the maximum entropy method. A statement is judged whether it is a task statement according to its probability. A seminar-based experiment validated the effectiveness of the proposed extraction method.","PeriodicalId":102743,"journal":{"name":"2015 7th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management (IC3K)","volume":"39 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-11-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127612970","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 10
Arabic sentiment analysis using WEKA a hybrid learning approach
S. Alhumoud, Tarfa Albuhairi, Mawaheb Altuwaijri
{"title":"Arabic sentiment analysis using WEKA a hybrid learning approach","authors":"S. Alhumoud, Tarfa Albuhairi, Mawaheb Altuwaijri","doi":"10.5220/0005616004020408","DOIUrl":"https://doi.org/10.5220/0005616004020408","url":null,"abstract":"Data has become the currency of this era and it is continuing to massively increase in size and generation rate. Large data generated out of organisations' e-transactions or individuals through social networks could be of a great value when analysed properly. This research presents an implementation of a sentiment analyser for Twitter's tweets which is one of the biggest public and freely available big data sources. It analyses Arabic, Saudi dialect tweets to extract sentiments toward a specific topic. It used a dataset consisting of 3000 tweets collected from Twitter. The collected tweets were analysed using two machine learning approaches, supervised which is trained with the dataset collected and the proposed hybrid learning which is trained on a single words dictionary. Two algorithms are used, Support Vector Machine (SVM) and K-Nearest Neighbors (KNN). The obtained results by the cross validation on the same dataset clearly confirm the superiority of the hybrid learning approach over the supervised approach.","PeriodicalId":102743,"journal":{"name":"2015 7th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management (IC3K)","volume":"144 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-11-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115562052","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 23
Comparison of sampling size estimation techniques for association rule mining
Tugba Halici, U. Ketenci
{"title":"Comparison of sampling size estimation techniques for association rule mining","authors":"Tugba Halici, U. Ketenci","doi":"10.5220/0005589801950202","DOIUrl":"https://doi.org/10.5220/0005589801950202","url":null,"abstract":"Fast and complete retrieval of individual customer needs and “to the point” product offers are crucial aspects of customer satisfaction in today's highly competitive banking sector. The growing number of transactions and customers has greatly increased the time and memory needed for market basket analysis. In this paper, a sampling process is included in the analysis, aiming to increase the performance of a product offer system. The core logic of sampling is to find a smaller representative of the universe that still generates accurate association rules. A smaller sample of the universe reduces the elapsed time and the memory consumption devoted to market basket analysis. In this context, the sampling methods, the sampling size estimation techniques and the representativeness tests are examined. The technique that gives the complete set of association rules in a reduced amount of time is suggested for sampling retail banking data.","PeriodicalId":102743,"journal":{"name":"2015 7th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management (IC3K)","volume":"abs/1606.08164 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-11-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125243865","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1
Piecewise Chebyshev factorization based nearest neighbour classification for time series
Qinglin Cai, Ling Chen, Jianling Sun
{"title":"Piecewise Chebyshev factorization based nearest neighbour classification for time series","authors":"Qinglin Cai, Ling Chen, Jianling Sun","doi":"10.5220/0005611900840091","DOIUrl":"https://doi.org/10.5220/0005611900840091","url":null,"abstract":"In the research field of time series analysis and mining, the nearest neighbour classifier (1NN) based on dynamic time warping distance (DTW) is well known for its high accuracy. However, the high computational complexity of DTW can make classification time-consuming. An effective solution is to compute DTW in the piecewise approximation space (PA-DTW), which transforms the raw data into the feature space based on segmentation, and extracts the discriminatory features for similarity measure. However, most existing piecewise approximation methods need to fix the segment length and focus on simple statistical features, which affects the precision of PA-DTW. To address this problem, we propose a novel piecewise factorization model for time series, which uses an adaptive segmentation method and factorizes the subsequences with the Chebyshev polynomials. The Chebyshev coefficients are extracted as features for the PA-DTW measure (ChebyDTW), which are able to capture the fluctuation information of time series. The comprehensive experimental results show that ChebyDTW can support accurate and fast 1NN classification.","PeriodicalId":102743,"journal":{"name":"2015 7th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management (IC3K)","volume":"46 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-11-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132110828","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
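For readers unfamiliar with the baseline that the abstract above starts from, the following is a minimal sketch of plain 1NN classification with the standard DTW distance on raw series. It is not the ChebyDTW method the paper proposes; the function names `dtw_distance` and `classify_1nn` and the toy training series are assumptions made for illustration. The nested loop shows the quadratic cost per comparison that motivates computing DTW in a piecewise approximation space.

```python
import math


def dtw_distance(a, b):
    """Standard DTW distance between two numeric sequences (absolute difference as local cost)."""
    n, m = len(a), len(b)
    # cost[i][j] holds the minimal cumulative cost of aligning a[:i] with b[:j].
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # a[i-1] also matched with an earlier element of b
                cost[i][j - 1],      # b[j-1] also matched with an earlier element of a
                cost[i - 1][j - 1],  # both sequences advance together
            )
    return cost[n][m]


def classify_1nn(query, training):
    """Return the label of the training series closest to query under DTW.

    training is a list of (series, label) pairs.
    """
    _, label = min(training, key=lambda pair: dtw_distance(query, pair[0]))
    return label


# Hypothetical labelled series.
training = [([0, 1, 2, 3], "rising"), ([3, 2, 1, 0], "falling")]
predicted = classify_1nn([0, 0, 1, 2, 3, 3], training)
```

Because DTW allows one point to align with several points in the other series, the stretched query above still matches the "rising" template exactly.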
Word Sense Discrimination on tweets: A graph-based approach
F. M. Cecchini, E. Fersini, E. Messina
{"title":"Word Sense Discrimination on tweets: A graph-based approach","authors":"F. M. Cecchini, E. Fersini, E. Messina","doi":"10.5220/0005640501380146","DOIUrl":"https://doi.org/10.5220/0005640501380146","url":null,"abstract":"In this paper we are going to detail an unsupervised, graph-based approach for word sense discrimination on tweets. We deal with this problem by constructing a word graph of co-occurrences. By defining a distance on this graph, we obtain a word metric space, on which we can apply an aggregative algorithm for word clustering. As a result, we will get word clusters representing contexts that discriminate the possible senses of a term. We present some experimental results both on a data set consisting of tweets we collected and on the data set of task 14 at SemEval-2010.","PeriodicalId":102743,"journal":{"name":"2015 7th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management (IC3K)","volume":"25 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-11-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127928355","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 6