Distributed Inference Approach on Massive Datasets using MapReduce
Authors: M. Priadarsini, M. Dharani
Venue: 2023 International Conference on Computer Communication and Informatics (ICCCI)
Published: 2023-01-23
DOI: 10.1109/ICCCI56745.2023.10128196 (https://doi.org/10.1109/ICCCI56745.2023.10128196)
Contemporary computer systems and applications generate a high volume of data every day. Extracting knowledge from this ever-growing, high-velocity, high-volume data is crucial for insight and business intelligence. Semantic web approaches that generate inferences to derive knowledge have been quite successful. When processing large amounts of data, however, a centralised method for finding inferences in ontologies is ineffective, so a distributed strategy is needed. The major challenges with large-scale data are deriving suitable triples for appropriate inferences, reducing the time spent on inference processing, and providing scalable computation for large datasets. Storage space for the growing data must also be used efficiently. This paper proposes a distributed inference approach that addresses these issues by constructing SIM (Sparse Index Method) and ATC (Assertional Triples Construction), and that processes users' queries efficiently.
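The abstract does not describe how SIM or ATC work, so the following is only a generic illustration of MapReduce-style inference over triples, not the authors' method. It is a minimal single-process sketch that applies one standard RDFS rule (if x has type C and C is a subclass of D, then x has type D) as a map step, a shuffle, and a reduce step, repeated until no new triples appear. The function names (`map_phase`, `shuffle`, `reduce_phase`, `infer_types`) and the `ex:` sample data are hypothetical.

```python
from collections import defaultdict

RDF_TYPE = "rdf:type"
SUBCLASS_OF = "rdfs:subClassOf"


def map_phase(triples):
    """Emit (class, tagged value) pairs so both sides of the join share a key."""
    for s, p, o in triples:
        if p == RDF_TYPE:
            # (x rdf:type C) is keyed by C
            yield o, ("instance", s)
        elif p == SUBCLASS_OF:
            # (C rdfs:subClassOf D) is also keyed by C
            yield s, ("superclass", o)


def shuffle(pairs):
    """Group mapper output by key; a real framework does this across nodes."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(values):
    """Join the instances of a class with its direct superclasses."""
    instances = [v for tag, v in values if tag == "instance"]
    superclasses = [v for tag, v in values if tag == "superclass"]
    for inst in instances:
        for sup in superclasses:
            yield (inst, RDF_TYPE, sup)


def infer_types(triples):
    """Repeat map/shuffle/reduce until a round derives nothing new."""
    known = set(triples)
    while True:
        groups = shuffle(map_phase(known))
        derived = {t for values in groups.values() for t in reduce_phase(values)}
        new = derived - known
        if not new:
            return known
        known |= new


if __name__ == "__main__":
    # Hypothetical sample data, for illustration only.
    sample = [
        ("ex:rex", RDF_TYPE, "ex:Dog"),
        ("ex:Dog", SUBCLASS_OF, "ex:Mammal"),
        ("ex:Mammal", SUBCLASS_OF, "ex:Animal"),
    ]
    for triple in sorted(infer_types(sample)):
        print(triple)
```

In this sketch each round corresponds to one MapReduce job. Keying both triple patterns by the shared class lets every reducer perform its join independently, which is what would allow the work to be spread across machines.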