Automatic Artificial Data Generator: Framework and implementation

Syahaneim, Raja Asilah Hazwani, N. Wahida, Siti Intan Shafikah, Zuraini, Puteri Nor Ellyza
2016 International Conference on Information and Communication Technology (ICICTM)
DOI: 10.1109/ICICTM.2016.7890777 (https://doi.org/10.1109/ICICTM.2016.7890777)
Citations: 12

Abstract

Extracting unknown and potentially useful information from a set of examples with the desired features is crucial for data analysis and interpretation. Public repositories have become the most common place to look for a suitable problem domain. However, relying on the data available in a public repository has several disadvantages. An automatic problem generation system would therefore offer several advantages over traditional methods. This paper focuses on data extraction and artificial data generation. A framework is proposed that consists of four main phases: 1) Data extraction, 2) Data characterization, 3) Artificial data generation and 4) Artificial data creation. The approach systematically creates testing datasets based on real data extracted from reliable sources. The system uses a random permutation algorithm to generate large amounts of artificial data that resemble real data.
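
The abstract does not say how the random permutation step works, so the following is only a minimal sketch of one plausible reading: each column of the extracted real data is shuffled independently, so every artificial row reuses real values while the row-wise pairings change. The function name `generate_artificial_rows` and its parameters are hypothetical and do not come from the paper.

```python
import random


def generate_artificial_rows(real_rows, n_rows, seed=None):
    """Build n_rows artificial rows by permuting each column of real_rows.

    Assumption (not from the paper): "random permutation" is read here as
    an independent shuffle of every column, which keeps each column's set
    of values but breaks the original pairing between columns.
    """
    if not real_rows:
        raise ValueError("real_rows must contain at least one row")

    rng = random.Random(seed)
    columns = list(zip(*real_rows))  # transpose rows into columns

    artificial = []
    while len(artificial) < n_rows:
        # rng.sample with k == len(col) returns a random permutation of col
        shuffled = [rng.sample(col, len(col)) for col in columns]
        artificial.extend(zip(*shuffled))  # transpose back into rows
    return artificial[:n_rows]


if __name__ == "__main__":
    real = [(5.1, "setosa"), (6.4, "versicolor"), (7.2, "virginica")]
    for row in generate_artificial_rows(real, n_rows=5, seed=42):
        print(row)
```

A column-wise shuffle preserves each attribute's marginal distribution but discards correlations between attributes; a generator that must keep those correlations would need a different scheme.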