{"title":"Scalable Neural Network Algorithms for High Dimensional Data","authors":"Mukesh Soni, Marwan Ali Shnan, Y. Bengio","doi":"10.58496/mjbd/2023/001","DOIUrl":"https://doi.org/10.58496/mjbd/2023/001","url":null,"abstract":"The bottleneck for machine learning practitioners has recently shifted from limited data to the inability of algorithms to use all of the available data in the time allowed. Because of this, researchers are now concerned with the scalability of machine learning algorithms in addition to their accuracy. The key to success for many computer vision and machine learning challenges is having big training sets. Several published systematic reviews were taken into account for this topic. Recent systematic reviews may include both more recent and older research on the subject under study, so the publications we examined were all recent. The review used information gathered between 2010 and 2021. Method: In this paper, we build a custom neural network to extract potential features from very high-dimensional datasets. These features can be interpreted at both an aggregated level and a very fine-grained level. Non-linear relationships are as easy to interpret as a linear regression. We apply the method to a dataset of product returns in online shopping that has 15,555 dimensions and 5,659,676 total transactions. Result and conclusion: We compare 87 different models to show that our approach not only produces higher predictive accuracy than existing techniques, but is also interpretable. The outcomes show that feature selection is a useful strategy for enhancing scalability. The method is sufficiently abstract to be used with many different analytics datasets.","PeriodicalId":325612,"journal":{"name":"Mesopotamian Journal of Big Data","volume":"10 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-01-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131742431","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Survey on Distributed Reinforcement Learning","authors":"Maroning Useng, Suleiman Avdulrahman","doi":"10.58496/mjbd/2022/006","DOIUrl":"https://doi.org/10.58496/mjbd/2022/006","url":null,"abstract":"Reinforcement learning (RL) has shown remarkable success in solving complex decision-making problems in various domains. However, traditional RL algorithms are often limited by their inability to handle large-scale and complex problems. Distributed reinforcement learning (DRL) is an emerging research field that aims to address these limitations by distributing the learning process across multiple agents or machines. In this paper, we provide a comprehensive survey of DRL, including its background, challenges, applications, evaluation, scalability, and open problems. We present a taxonomy of DRL methods and frameworks, and provide a comparative analysis of different DRL techniques. We also discuss the real-world applications of DRL in various domains, and highlight the challenges and limitations of applying DRL in practical scenarios. Furthermore, we evaluate the performance of DRL algorithms on benchmark tasks, and discuss current trends and future directions for evaluating DRL algorithms. We also discuss the techniques for improving the scalability and efficiency of DRL algorithms, including the approaches for distributed computing in DRL. Finally, we identify critical issues and challenges in DRL research, and provide recommendations for future research in this field. Overall, this survey aims to provide a comprehensive overview of the current state-of-the-art in DRL research and its applications.","PeriodicalId":325612,"journal":{"name":"Mesopotamian Journal of Big Data","volume":"27 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-11-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133890771","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"The Renewable, Sustainable, and Clean Energy in Iraq Between Reality and Ambition According to the Paris Agreement on Climate Change","authors":"Noah Mohammed Saleh, A. Saleh, Raed Abdulkareem Hasan, H. Mahdi","doi":"10.58496/mjbd/2022/005","DOIUrl":"https://doi.org/10.58496/mjbd/2022/005","url":null,"abstract":"For quite some time now, Iraq has witnessed a great shortage, not only in the production of electric power, but even in the distribution system. The problem is exacerbated by the large increase in Iraq's population and by the severe difficulties the country has experienced, especially the fierce confrontation with ISIS terrorist gangs, which drained much of Iraq’s human and material resources and negatively affected the energy situation in our country. Moreover, all or most of our electric power plants run on heavy fossil fuels and rely on old technology. Iraq has power shortages, and there are various obstacles that must be overcome in order to keep up with projected demand. Based on the results of this study, it appears that solar, wind, and biomass energy are currently underutilized but have the potential to contribute significantly to Iraq's renewable energy future. Offshore wind power in the Gulf (near Basrah in southern Iraq) also has untapped potential that has yet to be explored. There has been talk about the Iraqi government's efforts to harness green energy. The purpose of this article is to examine and debate the present and future of renewable energy in Iraq. Renewable energy applications such as solar, wind, and biomass are discussed. Finally, suggestions for making use of various energy sources are provided.","PeriodicalId":325612,"journal":{"name":"Mesopotamian Journal of Big Data","volume":"10 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-10-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133059187","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"An Overview on Deep Leaning Application of Big Data","authors":"Rofia Abada, A. Abubakar, Muhammad Tayyab Bilal","doi":"10.58496/mjbd/2022/004","DOIUrl":"https://doi.org/10.58496/mjbd/2022/004","url":null,"abstract":"Big data refers to the large volumes of structured and unstructured data that are generated by businesses, organizations, and individuals on a daily basis. Deep learning is a type of machine learning that involves the use of artificial neural networks to learn patterns and relationships in data. In this paper, we discuss the applications of deep learning in the field of big data analysis. We provide an overview of deep learning and big data, and then delve into specific examples of how deep learning has been used in various domains to extract value from big data. These domains include predictive analytics, image and video analysis, natural language processing, and recommendation systems. We also discuss some of the challenges and limitations of using deep learning for big data analysis, as well as future directions for research and development in this field. Overall, deep learning has proven to be a powerful tool for extracting insights from big data, and is likely to play an increasingly important role in the field of data science.","PeriodicalId":325612,"journal":{"name":"Mesopotamian Journal of Big Data","volume":"22 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-07-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125713713","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Introduction to The Data Mining Techniques in Cybersecurity","authors":"Maad M. Mijwil","doi":"10.58496/mjcs/2022/004","DOIUrl":"https://doi.org/10.58496/mjcs/2022/004","url":null,"abstract":"As a result of the evolution of the Internet and the massive amount of data that is transmitted every second, as well as the methods for protecting and preserving it and distinguishing those who are authorized to view it, the role of cyber security has evolved to provide the best protection for information over the network. In this paper, the researcher discusses the role of data mining methods in cyber security. Data mining has several uses in security, including national security (for example, surveillance) and cyber security (e.g., virus detection). Attacks against buildings and the destruction of key infrastructure, such as power grids and telecommunications networks, are examples of national security concerns. Cybersecurity is concerned with safeguarding computer and network systems from harmful malware such as Trojan horses and viruses. In addition, data mining is being used to deliver solutions such as intrusion detection and auditing.","PeriodicalId":325612,"journal":{"name":"Mesopotamian Journal of Big Data","volume":"17 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-05-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115442329","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Big Data Processing: A review","authors":"Taufik Gusman, Mohammad Naeemullah, Adeeb Mansoor Qasim","doi":"10.58496/mjbd/2022/003","DOIUrl":"https://doi.org/10.58496/mjbd/2022/003","url":null,"abstract":"Big data processing is a rapidly growing field that involves the collection, storage, and analysis of extremely large and complex data sets. It has the potential to transform the way organizations operate and make decisions, and it has been used in a wide range of industries and applications, including e-commerce, financial services, transportation, and more. In this paper, we provide an overview of big data processing, including its definitions, characteristics, and challenges. We also discuss the tools and technologies that are commonly used for big data processing, as well as the ethical considerations that are associated with this technology. Finally, we look at the future directions of big data processing and the trends and developments that are likely to shape this field in the coming years.","PeriodicalId":325612,"journal":{"name":"Mesopotamian Journal of Big Data","volume":"52 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-04-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124020927","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Big Data Distributed Support Vector Machine","authors":"Baby Nirmala, Raed Abueid, Munef Abdullah Ahmed","doi":"10.58496/mjbd/2022/002","DOIUrl":"https://doi.org/10.58496/mjbd/2022/002","url":null,"abstract":"Data mining and machine learning (ML) methods are being used more than ever before in cyber security. ML is one of the potential solutions that may be successful against zero-day attacks, starting with the categorization of IP traffic and the filtering of harmful traffic for intrusion detection. In this field, several published systematic reviews were taken into consideration. Contemporary systematic reviews may incorporate both older and more recent works on the topic of investigation, so all of the papers we looked at were recent. Data from 2016 to 2021 were used in the study. Both security professionals and hackers use data mining capabilities. Data mining applications may be used to analyze programme activity, surfing patterns, and other factors to identify potential future cyber-attacks. New research is being conducted using statistical traffic features, ML, and data mining approaches. This research conducts a focused literature review on machine learning and its use in cyber analytics for email filtering, traffic categorization, and intrusion detection. Each approach was identified and summarized based on its relevance and number of citations. Some well-known datasets are also discussed, since they are a crucial component of ML techniques. Some advice is also offered on when to use a particular algorithm. Four ML algorithms have been evaluated on MODBUS data gathered from a gas pipeline. Using ML algorithms, different attacks have been categorized, and the effectiveness of each approach has then been evaluated. This study demonstrates the use of ML and data mining for threat research and detection, with a focus on malware detection with high accuracy and short detection times.","PeriodicalId":325612,"journal":{"name":"Mesopotamian Journal of Big Data","volume":"2677 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-02-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126372064","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Big Data Machine Learning Using Apache Spark Mllib","authors":"Ziaul Hasan","doi":"10.58496/mjbd/2022/001","DOIUrl":"https://doi.org/10.58496/mjbd/2022/001","url":null,"abstract":"The research community has used artificial intelligence, and machine learning in particular, in various ways to transform diverse and even heterogeneous data sources into high-quality facts and knowledge, offering leading capabilities for accurate pattern discovery. However, applying machine learning techniques to large and complex datasets is computationally expensive and consumes a great deal of logical and physical resources, including CPU, memory, and data file space. The current study reviews the work of different researchers from 2010 to 2022. As the amount of data produced every day reaches quintillions of bytes, it is becoming more crucial than ever to have a robust platform for effective big data analytics. One of the most prominent big data analytics platforms is Apache Spark MLlib, which provides a number of capabilities for machine learning applications such as regression, classification, dimensionality reduction, clustering, and rule extraction. This study's underlying premise is that Spark ML's big data performance and accuracy are significantly better than Spark MLlib's. A dataset of bank customer transactions is used in the comparison. We are unlikely to be able to handle the volumes and types of data we are dealing with using traditional software solutions. Thus, modern big data processing technologies that can distribute and process data in a scalable way are either integrated into or taken over by conventional business intelligence (BI) systems. Big data technology can also help us learn more about security, which can be discovered from huge databases. The big data analytics engine Apache Spark is used in the study to present a security-related data analysis.","PeriodicalId":325612,"journal":{"name":"Mesopotamian Journal of Big Data","volume":"3 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-01-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134442359","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Distributed Reduced Convolution Neural Networks","authors":"Mohammad Alajanbi, D. Malerba, He Liu","doi":"10.58496/mjbd/2021/005","DOIUrl":"https://doi.org/10.58496/mjbd/2021/005","url":null,"abstract":"Convolution Neural Networks (CNNs) are frequently used in pattern recognition and machine learning. The kernel extension of CNN, often known as KCNN, offers performance superior to that of conventional CNN. Although it is capable of solving difficult nonlinear problems, the KCNN takes a lot of time and requires a lot of memory when working with a large kernel matrix. A reduced kernel approach has the potential to significantly lower the computational burden and memory consumption. However, since the total quantity of training data continues to expand at an exponential rate, it becomes impossible for a single worker to store the kernel matrix efficiently. This renders centralized data mining impossible to implement. This study proposes a distributed reduced kernel approach for training CNN on decentralized data, referred to as DRCNN. In the DRCNN, we arbitrarily distribute the data to the various nodes. The communication between nodes is static and does not depend on the amount of training data stored on each node; instead, it is determined by the architecture of the network. In contrast to the existing reduced kernel CNN, the DRCNN is a completely distributed training technique based on the alternating direction method of multipliers (ADMM). Experimentation with a large data set reveals that the distributed technique can produce virtually the same outcomes as the centralized algorithm while requiring significantly less computation time.","PeriodicalId":325612,"journal":{"name":"Mesopotamian Journal of Big Data","volume":"144 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-07-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123390897","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Parallel Generalized Hebbian Algorithm for Large Scale Data Analytics","authors":"M. Yaseen, Mohammad Naeemullah, Ibarhim Adeb Mansoor","doi":"10.58496/mjbd/2021/003","DOIUrl":"https://doi.org/10.58496/mjbd/2021/003","url":null,"abstract":"In order to store and analyse large amounts of data on a parallel cluster, Big Data Systems such as Hadoop and DBMSs require a complex configuration and tuning procedure. This is mostly the result of static partitioning, which occurs whenever data sets are imported or transferred into the file system. Parallel processing is then carried out in a distributed fashion, with the objective of achieving balanced parallel execution among nodes. The system is notoriously difficult to configure, particularly in the areas of node synchronisation, data redistribution, and distributed caching in main memory. The generalized Hebbian algorithm, abbreviated as GHA, is a linear feedforward neural network model for unsupervised learning that finds most of its applications in principal component analysis. Sanger's rule is another name for the GHA found in the academic literature; it is notable for its formulation and stability, with the additional feature that it can be applied to networks that have more than one output. This paper presents a novel hardware architecture for principal component analysis. The GHA was chosen as the foundation for the design because it is both straightforward and efficient. The architecture may be broken down into three distinct parts: the memory unit, the weight vector updating unit, and the primary computing unit. Within the weight vector updating unit, the computation of the various synaptic weight vectors shares the same circuit in order to reduce area costs. The GHA architecture incorporates a versatile multi-computer framework that is based on MPI. Therefore, GHA may be efficiently executed on platforms that use either sequential or parallel processing. When the data set is studied for a short period of time or when a dynamic number of virtual processors is selected at runtime, we predict that our architecture will be able to benefit from parallel processing on the cloud. This research presents a parallel implementation of a variety of machine learning algorithms built on top of the MapReduce paradigm, with the purpose of improving processing speed and saving time.","PeriodicalId":325612,"journal":{"name":"Mesopotamian Journal of Big Data","volume":"119 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2021-03-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123064441","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}