Gourav Beriwal, Alexander Bohn, Addison Josey, Jordan Kohn, Christopher A Sherrick, Donald E. Brown, R. Bailey
Approach for mapping data to client projects
2016 IEEE Systems and Information Engineering Design Symposium (SIEDS), published 2016-04-29
DOI: 10.1109/SIEDS.2016.7489333 (https://doi.org/10.1109/SIEDS.2016.7489333)
Abstract
In today's technological era, the volume of data processed daily is growing exponentially. This growth is affecting consulting firms and the expectations client companies have of consultants. Consulting firms are expected not only to use their clients' data but also to leverage open-source data and data from other firms that can help address the clients' needs. Manually reviewing the relevance of a dataset to a given problem is time-intensive and error-prone. Additionally, it is nearly impossible to detect non-intuitive correlations in the data through manual review. Hence, this project focuses on building an application that enables users to automatically find datasets and assess their applicability to a given problem. The project has three objectives: to increase the applicability of individual datasets, to increase the applicability of clusters of correlated datasets, and to increase the usability of individual datasets. Three metrics were derived from these objectives, respectively: relevance, coverage, and data quality. The application, known as UVa Open Miner, measures the relevance of a dataset to the client problem, the degree to which a group of datasets covers different dimensions of the client problem, and the quality of the dataset. The application can be refactored into a reusable solution that consulting firms can use for their client work.
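
The abstract names the three metrics but does not say how UVa Open Miner computes them. The sketch below is only one plausible reading, not the authors' method. It assumes relevance is the Jaccard overlap between problem keywords and dataset keywords, coverage is the fraction of problem dimensions touched by at least one dataset in a group, and quality is the share of non-missing cells. All function names, parameters, and definitions are hypothetical.

```python
from typing import Dict, List, Optional, Set


def relevance(problem_terms: Set[str], dataset_terms: Set[str]) -> float:
    """Assumed definition: Jaccard overlap of problem and dataset keywords."""
    union = problem_terms | dataset_terms
    if not union:
        return 0.0
    return len(problem_terms & dataset_terms) / len(union)


def coverage(problem_dimensions: Set[str], datasets: List[Set[str]]) -> float:
    """Assumed definition: fraction of problem dimensions that appear
    in at least one dataset of the group."""
    if not problem_dimensions:
        return 0.0
    # Union of all dimensions offered by the group (empty if no datasets).
    offered = set().union(*datasets)
    return len(problem_dimensions & offered) / len(problem_dimensions)


def quality(rows: List[Dict[str, Optional[object]]]) -> float:
    """Assumed definition: share of non-missing (non-None) cells,
    a simple completeness proxy for data quality."""
    cells = [value for row in rows for value in row.values()]
    if not cells:
        return 0.0
    return sum(value is not None for value in cells) / len(cells)


if __name__ == "__main__":
    # Hypothetical client problem and datasets, for illustration only.
    print(relevance({"traffic", "accidents", "weather"},
                    {"weather", "accidents", "roads", "population"}))
    print(coverage({"time", "location", "cost"},
                   [{"time", "price"}, {"location"}]))
    print(quality([{"a": 1, "b": None}, {"a": 2, "b": 3}]))
```

Under these assumptions, relevance and quality score a single dataset while coverage scores a group, which mirrors the split between individual datasets and clusters of datasets in the project objectives.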