{"title":"Construction of a Bayesian network as an extension of propositional logic","authors":"Takuto Enomoto, M. Kimura","doi":"10.5220/0005595102110217","DOIUrl":"https://doi.org/10.5220/0005595102110217","url":null,"abstract":"A Bayesian network is a probabilistic graphical model. Many conventional methods have been proposed for its construction. However, these methods often result in an incorrect Bayesian network structure. In this study, to correctly construct a Bayesian network, we extend the concept of propositional logic. We propose a methodology for constructing a Bayesian network with causal relationships that are extracted only if the antecedent states are true. In order to determine the logic to be used in constructing the Bayesian network, we propose the use of association rule mining such as the Apriori algorithm. We evaluate the proposed method by comparing its result with that of traditional method, such as Bayesian Dirichlet equivalent uniform (BDeu) score evaluation with a hill climbing algorithm, that shows that our method generates a network with more necessary arcs than that generated by the traditional method.","PeriodicalId":102743,"journal":{"name":"2015 7th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management (IC3K)","volume":"276 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-11-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115211207","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Domain-specific Relation Extraction: using distant supervision Machine Learning","authors":"Abduladem Aljamel, T. Osman, G. Acampora","doi":"10.5220/0005615100920103","DOIUrl":"https://doi.org/10.5220/0005615100920103","url":null,"abstract":"The increasing accessibility and availability of online data provides a valuable knowledge source for information analysis and decision-making processes. In this paper we argue that extracting information from this data is better guided by domain knowledge of the targeted use-case and investigate the integration of a knowledge-driven approach with Machine Learning techniques in order to improve the quality of the Relation Extraction process. Targeting the financial domain, we use Semantic Web Technologies to build the domain Knowledgebase, which is in turn exploited to collect distant supervision training data from semantic linked datasets such as DBPedia and Freebase. We conducted a series of experiments that utilise a number of Machine Learning algorithms to report on the favourable implementations/configuration for successful Information Extraction for our targeted domain.","PeriodicalId":102743,"journal":{"name":"2015 7th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management (IC3K)","volume":"6 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-11-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125278602","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Chinese-keyword fuzzy search and extraction over encrypted patent documents","authors":"Wei-Ze Ding, Yongji Liu, Jianfeng Zhang","doi":"10.5220/0005581001680176","DOIUrl":"https://doi.org/10.5220/0005581001680176","url":null,"abstract":"Cloud storage for information sharing is likely indispensable to the future national defence library in China e.g., for searching national defence patent documents, while security risks need to be maximally avoided using data encryption. Patent keywords are the high-level summary of the patent document, and it is significant in practice to efficiently extract and search the key words in the patent documents. Due to the particularity of Chinese keywords, most existing algorithms in English language environment become ineffective in Chinese scenarios. For extracting the keywords from patent documents, the manual keyword extraction is inappropriate when the amount of files is large. An improved method based on the term frequency-inverse document frequency (TF-IDF) is proposed to auto-extract the keywords in the patent literature. The extracted keyword sets also help to accelerate the keyword search by linking finite keywords with a large amount of documents. Fuzzy keyword search is introduced to further increase the search efficiency in the cloud computing scenarios compared to exact keyword search methods. Based on the Chinese Pinyin similarity, a Pinyin-Gram-based algorithm is proposed for fuzzy search in encrypted Chinese environment, and a keyword trapdoor search index structure based on the n-ary tree is designed. Both the search efficiency and accuracy of the proposed scheme are verified through computer experiments.","PeriodicalId":102743,"journal":{"name":"2015 7th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management (IC3K)","volume":"41 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-11-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125564793","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Topic oriented auto-completion models: Approaches towards fastening auto-completion systems","authors":"S. Prisca, M. Dînsoreanu, C. Lemnaru","doi":"10.5220/0005597502410248","DOIUrl":"https://doi.org/10.5220/0005597502410248","url":null,"abstract":"In this paper we propose an autocompletion approach suitable for mobile devices that aims to reduce the overall data model size and to speed up query processing while not employing any language specific processing. The approach relies on topic information from input documents to split the data models based on topics and index them in a way that allows fast identification through their corresponding topic. Doing so, the size of the data model used for prediction is decreased to almost one fifth of the size of a model that contains all topics, and the query processing becomes two times faster, while maintaining the same precision obtained by employing a model that contains all topics.","PeriodicalId":102743,"journal":{"name":"2015 7th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management (IC3K)","volume":"82 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-11-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127600512","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"POS tagging-probability weighted method for matching the Internet recipe ingredients with food composition data","authors":"T. Eftimov, B. Korousic-Seljak","doi":"10.5220/0005612303300336","DOIUrl":"https://doi.org/10.5220/0005612303300336","url":null,"abstract":"In this paper, we present a new method that can be used for matching recipe ingredients extracted from the Internet to nutritional data from food composition databases (FCDBs). The method uses part of speech tagging (POS tagging) to capture the information from the names of the ingredients and the names of the food analyses from FCDBs. Then, probability weighted model is presented, which takes into account the information from POS tagging to assign the weight on each match and the match with the highest weight is used as the most relevant one and can be used for further analyses. We evaluated our method using a collection of 721 lunch recipes, from which we extracted 1,615 different ingredients and the result showed that our method can match 91.82% of the ingredients with the FCDB.","PeriodicalId":102743,"journal":{"name":"2015 7th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management (IC3K)","volume":"28 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-11-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126546800","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Predicting stock market movement: An evolutionary approach","authors":"S. Bouktif, M. Awad","doi":"10.5220/0005578401590167","DOIUrl":"https://doi.org/10.5220/0005578401590167","url":null,"abstract":"Social Networks are becoming very popular sources of all kind of data. They allow a wide range of users to interact, socialize and express spontaneous opinions. The overwhelming amount of exchanged data on businesses, companies and governments make it possible to perform predictions and discover trends in many domains. In this paper we propose a new prediction model for the stock market movement problem based on collective classification. The model is using a number of public mood states as inputs to predict Up and Down movement of stock market. The proposed approach to build such a model is simultaneously promoting performance and interpretability. By interpretability, we mean the ability of a model to explain its predictions. A particular implementation of our approach is based on Ant Colony Optimization algorithm and customized for individual Bayesian classifiers. Our approach is validated with data collected from social media on the stock of a prestigious company. Promising results of our approach are compared with four alternative prediction methods namely, bagging, Adaboost, best expert, and expert trained on all the available data.","PeriodicalId":102743,"journal":{"name":"2015 7th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management (IC3K)","volume":"2005 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-11-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125608711","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Visualizing cultural digital resources using social network analysis","authors":"A. Capodieci, D. D'Aprile, G. Elia, F. Grippa, L. Mainetti","doi":"10.5220/0005585801860194","DOIUrl":"https://doi.org/10.5220/0005585801860194","url":null,"abstract":"This paper describes the design and implementation of a prototype to extract, collect and visually analyse cultural digital resources using social network analysis empowered with semantic features. An initial experiment involved the collection and visualization of connections between cultural digital resources - and their providers - stored in the platform DiCet (an Italian Living Lab centred on Cultural Heritage and Technology). This step helped to identify the most appropriate relational data model to use for the social network visualization phase. We then run a second experiment using a web application designed to extract relevant data from the platform Europeana.eu. The actors in our two-mode networks are Cultural Heritage Objects (CHOs) shared by institutional and individual providers, such as galleries, museums, individual experts and content aggregators. The links connecting nodes represent the digital resources associated to the CHOs. The application of the prototype offers insights on the most prominent providers, digital resources and cultural objects over time. Through the application of semantic analysis, we were also able to identify the most used words and the related sentiment associated to them.","PeriodicalId":102743,"journal":{"name":"2015 7th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management (IC3K)","volume":"2 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-11-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121440914","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Using the cluster-based tree structure of k-nearest neighbor to reduce the effort required to classify unlabeled large datasets","authors":"E. Oliveira, Howard Roatti, Matheus de Araujo Nogueira, Henrique Gomes Basoni, P. M. Ciarelli","doi":"10.5220/0005615305670576","DOIUrl":"https://doi.org/10.5220/0005615305670576","url":null,"abstract":"The usual practice in the classification problem is to create a set of labeled data for training and then use it to tune a classifier for predicting the classes of the remaining items in the dataset. However, labeled data demand great human effort, and classification by specialists is normally expensive and consumes a large amount of time. In this paper, we discuss how we can benefit from a cluster-based tree kNN structure to quickly build a training dataset from scratch. We evaluated the proposed method on some classification datasets, and the results are promising because we reduced the amount of labeling work by the specialists to 4% of the number of documents in the evaluated datasets. Furthermore, we achieved an average accuracy of 72.19% on tested datasets, versus 77.12% when using 90% of the dataset for training.","PeriodicalId":102743,"journal":{"name":"2015 7th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management (IC3K)","volume":"37 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-11-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134081562","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Usearch: A Meta Search Engine based on a new result merging strategy","authors":"Tarek Alloui, I. Boussebough, A. Chaoui, A. Nouar, Mohamed Chettah","doi":"10.5220/0005642905310536","DOIUrl":"https://doi.org/10.5220/0005642905310536","url":null,"abstract":"Meta Search Engines are finding tools developed for improving the search performance by submitting user queries to multiple search engines and combining the different search results in a unified ranked list. The effectiveness of a Meta search engine is closely related to the result merging strategy it employs. But nowadays, the main issue in the conception of such systems is the merging strategy of the returned results. With only the user query as relevant information about his information needs, it's hard to use it to find the best ranking of the merged results. We present in this paper a new strategy of merging multiple search engine results using only the user query as a relevance criterion. We propose a new score function combining the similarity between user query and retrieved results and the users' satisfaction toward used search engines. The proposed Meta search engine can be used for merging search results of any set of search engines.","PeriodicalId":102743,"journal":{"name":"2015 7th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management (IC3K)","volume":"250 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-11-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114463453","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Prediction of Earnings Per Share for industry","authors":"Swati Jadhav, Hongmei He, K. Jenkins","doi":"10.5220/0005616604250432","DOIUrl":"https://doi.org/10.5220/0005616604250432","url":null,"abstract":"Prediction of Earnings Per Share (EPS) is the fundamental problem in finance industry. Various Data Mining technologies have been widely used in computational finance. This research work aims to predict the future EPS with previous values through the use of data mining technologies, thus to provide decision makers a reference or evidence for their economic strategies and business activity. We created three models LR, RBF and MLP for the regression problem. Our experiments with these models were carried out on the real datasets provided by a software company. The performance assessment was based on Correlation Coefficient and Root Mean Squared Error. These algorithms were validated with the data of six different companies. Some differences between the models have been observed. In most cases, Linear Regression and Multilayer Perceptron are effectively capable of predicting the future EPS. But for the high nonlinear data, MLP gives better performance.","PeriodicalId":102743,"journal":{"name":"2015 7th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management (IC3K)","volume":"192 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-11-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116868182","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}