{"title":"迈向大数据中的知识发现","authors":"Richard K. Lomotey, R. Deters","doi":"10.1109/SOSE.2014.25","DOIUrl":null,"url":null,"abstract":"Analytics-as-a-Service (AaaS) has become indispensable because it affords stakeholders to discover knowledge in Big Data. Previously, data stored in data warehouses follow some schema and standardization which leads to efficient data mining. However, the Big Data epoch has witnessed the rise of structured, semi-structured, and unstructured data, a trend that motivated enterprises to employ the NoSQL data storages to accommodate the high-dimensional data. Unfortunately, the existing data mining techniques which are designed for schema-oriented storages are non-applicable to the unstructured data style. Thus, the AaaS though still in its infancy, is gaining widespread attention for its ability to provide novel ways and opportunities to mine the heterogeneous data. In this paper, we discuss our AaaS tool that performs terms and topics extraction and organization from unstructured data sources such as NoSQL databases, textual contents (e.g., websites), and structured sources (e.g. SQL). The tool is built on methodologies such as tagging, filtering, association maps, and adaptable dictionary. The evaluation of the tool shows high accuracy in the mining process.","PeriodicalId":360538,"journal":{"name":"2014 IEEE 8th International Symposium on Service Oriented System Engineering","volume":"26 2","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2014-04-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"41","resultStr":"{\"title\":\"Towards Knowledge Discovery in Big Data\",\"authors\":\"Richard K. Lomotey, R. Deters\",\"doi\":\"10.1109/SOSE.2014.25\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Analytics-as-a-Service (AaaS) has become indispensable because it affords stakeholders to discover knowledge in Big Data. Previously, data stored in data warehouses follow some schema and standardization which leads to efficient data mining. However, the Big Data epoch has witnessed the rise of structured, semi-structured, and unstructured data, a trend that motivated enterprises to employ the NoSQL data storages to accommodate the high-dimensional data. Unfortunately, the existing data mining techniques which are designed for schema-oriented storages are non-applicable to the unstructured data style. Thus, the AaaS though still in its infancy, is gaining widespread attention for its ability to provide novel ways and opportunities to mine the heterogeneous data. In this paper, we discuss our AaaS tool that performs terms and topics extraction and organization from unstructured data sources such as NoSQL databases, textual contents (e.g., websites), and structured sources (e.g. SQL). The tool is built on methodologies such as tagging, filtering, association maps, and adaptable dictionary. The evaluation of the tool shows high accuracy in the mining process.\",\"PeriodicalId\":360538,\"journal\":{\"name\":\"2014 IEEE 8th International Symposium on Service Oriented System Engineering\",\"volume\":\"26 2\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2014-04-07\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"41\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2014 IEEE 8th International Symposium on Service Oriented System Engineering\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/SOSE.2014.25\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2014 IEEE 8th International Symposium on Service Oriented System Engineering","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/SOSE.2014.25","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Analytics-as-a-Service (AaaS) has become indispensable because it affords stakeholders to discover knowledge in Big Data. Previously, data stored in data warehouses follow some schema and standardization which leads to efficient data mining. However, the Big Data epoch has witnessed the rise of structured, semi-structured, and unstructured data, a trend that motivated enterprises to employ the NoSQL data storages to accommodate the high-dimensional data. Unfortunately, the existing data mining techniques which are designed for schema-oriented storages are non-applicable to the unstructured data style. Thus, the AaaS though still in its infancy, is gaining widespread attention for its ability to provide novel ways and opportunities to mine the heterogeneous data. In this paper, we discuss our AaaS tool that performs terms and topics extraction and organization from unstructured data sources such as NoSQL databases, textual contents (e.g., websites), and structured sources (e.g. SQL). The tool is built on methodologies such as tagging, filtering, association maps, and adaptable dictionary. The evaluation of the tool shows high accuracy in the mining process.