Syahaneim, Raja Asilah Hazwani, N. Wahida, Siti Intan Shafikah, Zuraini, Puteri Nor Ellyza
2016 International Conference on Information and Communication Technology (ICICTM). DOI: https://doi.org/10.1109/ICICTM.2016.7890777
Automatic Artificial Data Generator: Framework and implementation
Extracting unknown and potentially useful information from a set of examples with the desired features is crucial for data analysis and interpretation. Public repositories are the most commonly used source when searching for a suitable problem domain. However, relying only on the data available in a public repository has several disadvantages. An automatic problem generation system would therefore offer several advantages over the traditional approach. This paper focuses on data extraction and artificial data generation. It proposes a framework that consists of four main phases: 1) data extraction, 2) data characterization, 3) artificial data generation and 4) artificial data creation. The approach systematically creates testing datasets based on real data extracted from reliable sources. The system uses a random permutation algorithm to generate a large amount of artificial data that resembles the real data.
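The abstract does not say how the random permutation is applied, so the following is only a minimal sketch under one assumption: each attribute (column) of the extracted real records is shuffled independently, which preserves every attribute's value distribution while producing new combinations of values. The function name `generate_artificial_rows` and its parameters are hypothetical and do not come from the paper.

```python
import random


def generate_artificial_rows(real_rows, n_rows, seed=None):
    """Hypothetical helper: build n_rows artificial records by
    independently permuting each column of real_rows.

    Assumption (not from the paper): permutation is applied per column.
    """
    if not real_rows:
        raise ValueError("real_rows must not be empty")
    rng = random.Random(seed)

    # Transpose the records so that each attribute can be shuffled on its own.
    columns = [list(col) for col in zip(*real_rows)]

    artificial = []
    while len(artificial) < n_rows:
        shuffled = []
        for col in columns:
            perm = col[:]          # copy so the real data stays untouched
            rng.shuffle(perm)      # random permutation of one attribute
            shuffled.append(perm)
        # Transpose back: one batch yields len(real_rows) artificial records.
        artificial.extend(zip(*shuffled))

    return artificial[:n_rows]


if __name__ == "__main__":
    real = [(1, "a", 0.5), (2, "b", 1.5), (3, "c", 2.5)]
    for row in generate_artificial_rows(real, 6, seed=0):
        print(row)
```

Shuffling columns independently keeps the per-attribute statistics of the real data but discards correlations between attributes. If those correlations matter for the test datasets, a different scheme would be needed, for example permuting whole groups of related columns together.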