{"title":"Understanding Negative Sampling in Knowledge Graph Embedding","authors":"Jing Qian, Gangmin Li, Katie Atkinson, Yong Yue","doi":"10.5121/IJAIA.2021.12105","DOIUrl":"https://doi.org/10.5121/IJAIA.2021.12105","url":null,"abstract":"Knowledge graph embedding (KGE) is to project entities and relations of a knowledge graph (KG) into a low-dimensional vector space, which has made steady progress in recent years. Conventional KGE methods, especially translational distance-based models, are trained through discriminating positive samples from negative ones. Most KGs store only positive samples for space efficiency. Negative sampling thus plays a crucial role in encoding triples of a KG. The quality of generated negative samples has a direct impact on the performance of learnt knowledge representation in a myriad of downstream tasks, such as recommendation, link prediction and node classification. We summarize current negative sampling approaches in KGE into three categories, static distribution-based, dynamic distribution-based and custom cluster-based respectively. Based on this categorization we discuss the most prevalent existing approaches and their characteristics. It is a hope that this review can provide some guidelines for new thoughts about negative sampling in KGE.","PeriodicalId":93188,"journal":{"name":"International journal of artificial intelligence & applications","volume":"12 1","pages":"71-81"},"PeriodicalIF":0.0,"publicationDate":"2021-01-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"42547409","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Towards Predicting Software Defects with Clustering Techniques","authors":"Waheeda Almayyan","doi":"10.5121/IJAIA.2021.12103","DOIUrl":"https://doi.org/10.5121/IJAIA.2021.12103","url":null,"abstract":"The purpose of software defect prediction is to improve the quality of a software project by building a predictive model to decide whether a software module is or is not fault prone. In recent years, much research in using machine learning techniques in this topic has been performed. Our aim was to evaluate the performance of clustering techniques with feature selection schemes to address the problem of software defect prediction problem. We analyzed the National Aeronautics and Space Administration (NASA) dataset benchmarks using three clustering algorithms: (1) Farthest First, (2) X-Means, and (3) self-organizing map (SOM). In order to evaluate different feature selection algorithms, this article presents a comparative analysis involving software defects prediction based on Bat, Cuckoo, Grey Wolf Optimizer (GWO), and particle swarm optimizer (PSO). The results obtained with the proposed clustering models enabled us to build an efficient predictive model with a satisfactory detection rate and acceptable number of features.","PeriodicalId":93188,"journal":{"name":"International journal of artificial intelligence & applications","volume":"12 1","pages":"39-54"},"PeriodicalIF":0.0,"publicationDate":"2021-01-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"45706980","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Analysis of Enterprise Shared Resource Invocation Scheme based on Hadoop and R","authors":"H. Xiong","doi":"10.5121/IJAIA.2021.12104","DOIUrl":"https://doi.org/10.5121/IJAIA.2021.12104","url":null,"abstract":"The response rate and performance indicators of enterprise resource calls have become an important part of measuring the difference in enterprise user experience. An efficient corporate shared resource calling system can significantly improve the office efficiency of corporate users and significantly improve the fluency of corporate users' resource calling. Hadoop has powerful data integration and analysis capabilities in resource extraction, while R has excellent statistical capabilities and resource personalized decomposition and display capabilities in data calling. This article will propose an integration plan for enterprise shared resource invocation based on Hadoop and R to further improve the efficiency of enterprise users' shared resource utilization, improve the efficiency of system operation, and bring enterprise users a higher level of user experience. First, we use Hadoop to extract the corporate shared resources required by corporate users from the nearby resource storage computer room and terminal equipment to increase the call rate, and use the R function attribute to convert the user’s search results into linear correlations, according to the correlation The strong and weak principles are displayed in order to improve the corresponding speed and experience. This article proposes feasible solutions to the shortcomings in the current enterprise shared resource invocation. We can use public data sets to perform personalized regression analysis on user needs, and optimize and integrate most relevant information.","PeriodicalId":93188,"journal":{"name":"International journal of artificial intelligence & applications","volume":"12 1","pages":"55-69"},"PeriodicalIF":0.0,"publicationDate":"2021-01-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"45532643","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Supervised and Unsupervised Machine Learning Methodologies for Crime Pattern Analysis","authors":"D. Sardana, S. Marwaha, R. Bhatnagar","doi":"10.5121/IJAIA.2021.12106","DOIUrl":"https://doi.org/10.5121/IJAIA.2021.12106","url":null,"abstract":"Crime is a grave problem that affects all countries in the world. The level of crime in a country has a big impact on its economic growth and quality of life of citizens. In this paper, we provide a survey of trends of supervised and unsupervised machine learning methods used for crime pattern analysis. We use a spatiotemporal dataset of crimes in San Francisco, CA to demonstrate some of these strategies for crime analysis. We use classification models, namely, Logistic Regression, Random Forest, Gradient Boosting and Naive Bayes to predict crime types such as Larceny, Theft, etc. and propose model optimization strategies. Further, we use a graph based unsupervised machine learning technique called core periphery structures to analyze how crime behavior evolves over time. These methods can be generalized to use for different counties and can be greatly helpful in planning police task forces for law enforcement and crime prevention.","PeriodicalId":93188,"journal":{"name":"International journal of artificial intelligence & applications","volume":"12 1","pages":"83-99"},"PeriodicalIF":0.0,"publicationDate":"2021-01-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"49171386","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Brief Survey of Question Answering Systems","authors":"Michael Caballero","doi":"10.5121/ijaia.2021.12501","DOIUrl":"https://doi.org/10.5121/ijaia.2021.12501","url":null,"abstract":"Question Answering (QA) is a subfield of Natural Language Processing (NLP) and computer science focused on building systems that automatically answer questions from humans in natural language. This survey summarizes the history and current state of the field and is intended as an introductory overview of QA systems. After discussing QA history, this paper summarizes the different approaches to the architecture of QA systems -- whether they are closed or open-domain and whether they are text-based, knowledge-based, or hybrid systems. Lastly, some common datasets in this field are introduced and different evaluation metrics are discussed.","PeriodicalId":93188,"journal":{"name":"International journal of artificial intelligence & applications","volume":"1 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2021-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"70613602","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Answer Set Programming to Model Plan Agent Scenarios","authors":"F. Z. Flores, Rosalba Cuapa Canto, José María Ángeles López","doi":"10.5121/ijaia.2020.11606","DOIUrl":"https://doi.org/10.5121/ijaia.2020.11606","url":null,"abstract":"One of the most challenging aspects of reasoning, planning, and acting in an agent domain is reasoning about what an agent knows about their environment to consider when planning and acting. There are various proposals that have addressed this problem using modal, epistemic and other logics. In this paper we explore how to take advantage of the properties of Answer Set Programming for this purpose. The Answer Set Programming's property of non-monotonicity allow us to express causality in an elegant fashion. We begin our discussion by showing how Answer Set Programming can be used to model the frog’s problem. We then illustrate how this problem can be represented and solved using these concepts. In addition, our proposal allows us to solve the generalization of this problem, that is, for any number of frogs.","PeriodicalId":93188,"journal":{"name":"International journal of artificial intelligence & applications","volume":"11 1","pages":"55-63"},"PeriodicalIF":0.0,"publicationDate":"2020-11-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"45228625","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Automatic Transfer Rate Adjustment for Transfer Reinforcement Learning","authors":"H. Kono, Yuto Sakamoto, Yonghoon Ji, Hiromitsu Fujii","doi":"10.5121/ijaia.2020.11605","DOIUrl":"https://doi.org/10.5121/ijaia.2020.11605","url":null,"abstract":"This paper proposes a novel parameter for transfer reinforcement learning to avoid over-fitting when an agent uses a transferred policy from a source task. Learning robot systems have recently been studied for many applications, such as home robots, communication robots, and warehouse robots. However, if the agent reuses the knowledge that has been sufficiently learned in the source task, deadlock may occur and appropriate transfer learning may not be realized. In the previous work, a parameter called transfer rate was proposed to adjust the ratio of transfer, and its contribution include avoiding dead lock in the target task. However, adjusting the parameter depends on human intuition and experiences. Furthermore, the method for deciding transfer rate has not discussed. Therefore, an automatic method for adjusting the transfer rate is proposed in this paper using a sigmoid function. Further, computer simulations are used to evaluate the effectiveness of the proposed method to improve the environmental adaptation performance in a target task, which refers to the situation of reusing knowledge.","PeriodicalId":93188,"journal":{"name":"International journal of artificial intelligence & applications","volume":"11 1","pages":"47-54"},"PeriodicalIF":0.0,"publicationDate":"2020-11-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"42345937","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Intelligent Portfolio Management via NLP Analysis of Financial 10-k Statements","authors":"Purva Singh","doi":"10.5121/ijaia.2020.11602","DOIUrl":"https://doi.org/10.5121/ijaia.2020.11602","url":null,"abstract":"The paper attempts to analyze if the sentiment stability of financial 10-K reports over time can determine the company’s future mean returns. A diverse portfolio of stocks was selected to test this hypothesis. The proposed framework downloads 10-K reports of the companies from SEC’s EDGAR database. It passes them through the preprocessing pipeline to extract critical sections of the filings to perform NLP analysis. Using Loughran and McDonald sentiment word list, the framework generates sentiment TF-IDF from the 10-K documents to calculate the cosine similarity between two consecutive 10-K reports and proposes to leverage this cosine similarity as the alpha factor. For analyzing the effectiveness of our alpha factor at predicting future returns, the framework uses the alphalens library to perform factor return analysis, turnover analysis, and for comparing the Sharpe ratio of potential alpha factors. The results show that there exists a strong correlation between the sentiment stability of our portfolio’s 10-K statements and its future mean returns. For the benefit of the research community, the code and Jupyter notebooks related to this paper have been open-sourced on Github1.","PeriodicalId":93188,"journal":{"name":"International journal of artificial intelligence & applications","volume":"11 1","pages":"13-25"},"PeriodicalIF":0.0,"publicationDate":"2020-11-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"46494005","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Using Contextual Graphs as a Decision-making Tool in the Process of Hiring Candidates","authors":"H. Tahir, P. Brézillon","doi":"10.5121/ijaia.2020.11604","DOIUrl":"https://doi.org/10.5121/ijaia.2020.11604","url":null,"abstract":"Poor selection of employees can be a first step towards a lack of motivation, poor performance, and high turnover, to name a few. It's no wonder that organizations are trying to find the best ways to avoid these slippages by finding the best possible person for the job. Therefore, it is very important to understand the context of hiring process to help to understand which recruiting mistakes are most damaging to the organization in order to reduce the recruiting challenges faced by Human resource managers by building their capacity to ensure optimal HR performance. This paper initiates a research about how Contextual Graphs Formalism can be used for improving the decision making in the process of hiring potential candidates. An example of a typical procedure for visualization of recruiting phases is presented to show how to add contextual elements and practices in order to communicate the recruitment policy in a concrete and memorable way to both hiring teams and candidates.","PeriodicalId":93188,"journal":{"name":"International journal of artificial intelligence & applications","volume":"11 1","pages":"37-46"},"PeriodicalIF":0.0,"publicationDate":"2020-11-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"46863610","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Problem Decomposition and Information Minimization for the Global, Concurrent, On-line Validation of Neutron Noise Signals and Neutron Detector Operation","authors":"Tatiana Tambouratzis","doi":"10.5121/ijaia.2020.11601","DOIUrl":"https://doi.org/10.5121/ijaia.2020.11601","url":null,"abstract":"This piece of research introduces a purely data-driven, directly reconfigurable, divide-and-conquer on-line monitoring (OLM) methodology for automatically selecting the minimum number of neutron detectors (NDs) – and corresponding neutron noise signals (NSs) – which are currently necessary, as well as sufficient, for inspecting the entire nuclear reactor (NR) in-core area. The proposed implementation builds upon the 3-tuple configuration, according to which three sufficiently pairwise-correlated NSs are capable of on-line (I) verifying each NS of the 3-tuple and (II) endorsing correct functioning of each corresponding ND, implemented herein via straightforward pairwise comparisons of fixed-length sliding time-windows (STWs) between the three NSs of the 3-tuple. A pressurized water NR (PWR) model – developed for H2020 CORTEX – is used for deriving the optimal ND/NS configuration, where (i) the evident partitioning of the 36 NDs/NSs into six clusters of six NDs/NSs each, and (ii) the high cross-correlations (CCs) within every 3-tuple of NSs, endorse the use of a constant pair comprising the two most highly CC-ed NSs per cluster as the first two members of the 3-tuple, with the third member being each remaining NS of the cluster, in turn, thereby computationally streamlining OLM without compromising the identification of either deviating NSs or malfunctioning NDs. Tests on the in-core dataset of the PWR model demonstrate the potential of the proposed methodology in terms of suitability for, efficiency at, as well as robustness in ND/NS selection, further establishing the “directly reconfigurable” property of the proposed approach at every point in time while using one-third only of the original NDs/NSs.","PeriodicalId":93188,"journal":{"name":"International journal of artificial intelligence & applications","volume":"11 1","pages":"1-12"},"PeriodicalIF":0.0,"publicationDate":"2020-11-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"45024519","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}