{"title":"数据科学的扩展:数据集标准化","authors":"Nuno Pessanha Santos","doi":"10.3390/standards3040028","DOIUrl":null,"url":null,"abstract":"With recent advances in science and technology, more processing capability and data have become available, allowing a more straightforward implementation of data analysis techniques. Fortunately, available online data storage capacity follows this trend, and vast amounts of data can be stored online freely or at accessible costs. As happens with every evolution (or revolution) in any science field, organizing and sharing these data is essential to contribute to new studies or validate obtained results quickly. To facilitate this, we must guarantee interoperability between existing datasets and developed software, whether commercial or open-source. This article explores this issue and analyzes the current initiatives to establish data standards and compares some of the existing online dataset storage platforms. Through a Strengths, Weaknesses, Opportunities, and Threats (SWOT) analysis, it is possible to better understand the strategy that should be taken to improve the efficiency in this field, which directly depends on the data’s characteristics. The development of dataset standards will directly increase the collaboration and data sharing between academia and industry, allowing faster research and development through direct interoperability.","PeriodicalId":21933,"journal":{"name":"Standards","volume":"107 1","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2023-11-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"The Expansion of Data Science: Dataset Standardization\",\"authors\":\"Nuno Pessanha Santos\",\"doi\":\"10.3390/standards3040028\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"With recent advances in science and technology, more processing capability and data have become available, allowing a more straightforward implementation of data analysis techniques. Fortunately, available online data storage capacity follows this trend, and vast amounts of data can be stored online freely or at accessible costs. As happens with every evolution (or revolution) in any science field, organizing and sharing these data is essential to contribute to new studies or validate obtained results quickly. To facilitate this, we must guarantee interoperability between existing datasets and developed software, whether commercial or open-source. This article explores this issue and analyzes the current initiatives to establish data standards and compares some of the existing online dataset storage platforms. Through a Strengths, Weaknesses, Opportunities, and Threats (SWOT) analysis, it is possible to better understand the strategy that should be taken to improve the efficiency in this field, which directly depends on the data’s characteristics. The development of dataset standards will directly increase the collaboration and data sharing between academia and industry, allowing faster research and development through direct interoperability.\",\"PeriodicalId\":21933,\"journal\":{\"name\":\"Standards\",\"volume\":\"107 1\",\"pages\":\"\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2023-11-30\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Standards\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.3390/standards3040028\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Standards","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.3390/standards3040028","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
The Expansion of Data Science: Dataset Standardization
With recent advances in science and technology, more processing capability and data have become available, allowing a more straightforward implementation of data analysis techniques. Fortunately, available online data storage capacity follows this trend, and vast amounts of data can be stored online freely or at accessible costs. As happens with every evolution (or revolution) in any science field, organizing and sharing these data is essential to contribute to new studies or validate obtained results quickly. To facilitate this, we must guarantee interoperability between existing datasets and developed software, whether commercial or open-source. This article explores this issue and analyzes the current initiatives to establish data standards and compares some of the existing online dataset storage platforms. Through a Strengths, Weaknesses, Opportunities, and Threats (SWOT) analysis, it is possible to better understand the strategy that should be taken to improve the efficiency in this field, which directly depends on the data’s characteristics. The development of dataset standards will directly increase the collaboration and data sharing between academia and industry, allowing faster research and development through direct interoperability.