Data warehouse designing for Vietnamese textual document-based plagiarism detection system
Authors: P. Ho, T. Vo, Ngoc Anh Thi Nguyen
Published in: 2017 International Conference on System Science and Engineering (ICSSE), 2017-07-01
DOI: 10.1109/ICSSE.2017.8030873 (https://doi.org/10.1109/ICSSE.2017.8030873)
This paper investigates the significant role of data warehouse design in a textual anti-plagiarism system. It covers the central issues of data warehouse modeling: (1) formulating the data representation, (2) establishing the foundations of the storage structure, and (3) proposing a corresponding architecture for storing, updating, and managing data. Two levels are considered to address these research axes. First, at the theoretical level, the objective is to introduce novel and practical contributions in the area of textual document-based plagiarism detection systems; an approach is proposed to collect, analyze, and store textual datasets. Second, at the implementation level, the paper focuses on the platform for processing the data, whose modeling exhibits promising capabilities such as real-time support, support for new data sources, and self-service capabilities. The application is carried out on Vietnamese text documents, comprising final reports and assignments, master's and Ph.D. dissertations, and scientific research papers from the University of Danang. The paper's contribution not only provides value to researchers, educators, and students in the University of Danang system but can also serve as a foundation for our further investigation into building a big-data warehouse serving an automatic duplicate-detection system.