Bernard Chen, Christopher Rhodes, Aaron Crawford, Lorri Hambuchen
2014 IEEE International Conference on Data Mining Workshop, published 2014-12-01. DOI: 10.1109/ICDMW.2014.149 (https://doi.org/10.1109/ICDMW.2014.149)
Wineinformatics: Applying Data Mining on Wine Sensory Reviews Processed by the Computational Wine Wheel
As the world becomes more digital, data science has become a successful field of study that incorporates techniques and theories from distinct disciplines. Of these, domain knowledge may be the most important, since every data science researcher must start with a domain problem and end with useful information within that domain. Identifying new application domains is therefore considered fundamental research in the area. Wine was once considered a luxury; today it is popular and enjoyed by a wide variety of people. Professional wine reviews provide insights on the tens of thousands of wines available each year. However, there is currently no systematic way to use this large number of reviews to benefit winemakers, distributors, and consumers. This project proposes a brand-new data science area named Wineinformatics. To automatically retrieve wines' flavors and characteristics from reviews, which are written in natural language, we propose a novel “Computational Wine Wheel” to extract keywords. Two different publicly available datasets are produced in this paper using the new method. A hierarchical clustering algorithm is applied to the first dataset and yields meaningful clustering results. An association rules algorithm is applied to the second dataset to predict, from a wine's sensory review, whether the wine scores above 90 points. 5-fold cross-validation experiments are run with different parameters and produce accuracies ranging from 73% to 82%. This new domain will bring huge benefits to fields as diverse as computer science, statistics, business, and agriculture.
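The keyword-extraction step can be illustrated with a minimal sketch. It is not the authors' Computational Wine Wheel: the dictionary `TOY_WHEEL`, its six entries, the normalized attribute names, and the helper functions `extract_attributes` and `to_binary_row` are all hypothetical, chosen only to show how free-text reviews might be mapped to binary attributes that clustering or association-rule algorithms can consume.

```python
import re

# Hypothetical miniature "wheel": maps phrases a reviewer might write to a
# normalized attribute name. These entries are illustrative assumptions and
# are not taken from the paper.
TOY_WHEEL = {
    "black cherry": "CHERRY",
    "cherry": "CHERRY",
    "blackberry": "BLACKBERRY",
    "vanilla": "VANILLA",
    "firm tannins": "TANNINS_HIGH",
    "long finish": "FINISH_LONG",
}


def extract_attributes(review, wheel=TOY_WHEEL):
    """Return the set of normalized attributes whose phrases occur in the review."""
    # Lowercase, replace anything that is not a letter or whitespace with a
    # space, then collapse runs of whitespace.
    text = re.sub(r"[^a-z\s]", " ", review.lower())
    text = " ".join(text.split())
    found = set()
    # Try longer phrases first so "black cherry" is consumed before "cherry".
    for phrase in sorted(wheel, key=len, reverse=True):
        pattern = r"\b" + re.escape(phrase) + r"\b"
        if re.search(pattern, text):
            found.add(wheel[phrase])
            text = re.sub(pattern, " ", text)
    return found


def to_binary_row(attributes, columns):
    """Encode a set of attributes as a 0/1 list aligned with the given columns."""
    return [1 if column in attributes else 0 for column in columns]


# Fixed column order so every wine is encoded against the same attributes.
COLUMNS = sorted(set(TOY_WHEEL.values()))

if __name__ == "__main__":
    review = "Ripe black cherry and vanilla, with firm tannins and a long finish."
    attributes = extract_attributes(review)
    print(sorted(attributes))
    print(to_binary_row(attributes, COLUMNS))
```

Each review becomes one binary row, so a collection of reviews forms a wine-by-attribute matrix. Several surface phrases map to one normalized attribute so that differently worded reviews remain comparable.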