Wineinformatics: Applying Data Mining on Wine Sensory Reviews Processed by the Computational Wine Wheel

Bernard Chen, Christopher Rhodes, Aaron Crawford, Lorri Hambuchen
Published in: 2014 IEEE International Conference on Data Mining Workshop, 2014-12-01
DOI: 10.1109/ICDMW.2014.149 (https://doi.org/10.1109/ICDMW.2014.149)
Citations: 36

Abstract

As the world becomes more digital, data science has emerged as a discipline that combines techniques and theories from many distinct fields. Of these ingredients, domain knowledge may be the most important, since every data science project starts with a domain problem and ends with useful information within that domain. Identifying new application domains is therefore regarded as fundamental research in the area. Wine was once considered a luxury, but today it is popular and enjoyed by a wide variety of people. Professional wine reviews provide insights on the tens of thousands of wines available each year. Currently, however, there is no systematic way to use this large number of reviews to benefit wine makers, distributors, and consumers. This project proposes a new data science area named Wineinformatics. To automatically retrieve wines' flavors and characteristics from reviews, which are stored as natural-language text, we propose a novel "Computational Wine Wheel" to extract keywords. Two publicly available datasets are produced with this method. A hierarchical clustering algorithm is applied to the first dataset and yields meaningful clusters. An association rules algorithm is applied to the second dataset to predict, from a wine's sensory review, whether the wine scores above 90 points. 5-fold cross-validation experiments under different parameter settings yield accuracies ranging from 73% to 82%. This new domain will bring huge benefits to fields as diverse as computer science, statistics, business, and agriculture.
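As a rough illustration of the pipeline the abstract describes, the sketch below maps review phrases to normalized attributes with a tiny dictionary, encodes each review as a 0/1 vector, and computes the confidence of a single rule of the form "attribute set implies a score above 90". The dictionary `TOY_WHEEL`, the function names, and the three reviews with their labels are all invented for illustration. The actual Computational Wine Wheel vocabulary, the two datasets, and the paper's clustering and association-rule procedures are not reproduced here.

```python
import re

# Hypothetical miniature "wheel": review phrase -> normalized attribute.
# The real Computational Wine Wheel vocabulary is not given in the abstract.
TOY_WHEEL = {
    "black cherry": "CHERRY",
    "cherry": "CHERRY",
    "blackberry": "BLACKBERRY",
    "vanilla": "VANILLA",
    "toasty oak": "OAK",
    "oak": "OAK",
    "long finish": "LONG_FINISH",
}


def extract_attributes(review, wheel=TOY_WHEEL):
    """Return the set of normalized attributes whose phrases occur in the review."""
    text = review.lower()
    found = set()
    # Match longer phrases first so "black cherry" is consumed before "cherry".
    for phrase in sorted(wheel, key=len, reverse=True):
        pattern = r"\b" + re.escape(phrase) + r"\b"
        if re.search(pattern, text):
            found.add(wheel[phrase])
            text = re.sub(pattern, " ", text)
    return found


def to_binary_row(attributes, columns):
    """Encode an attribute set as a 0/1 vector over a fixed column order."""
    return [1 if col in attributes else 0 for col in columns]


def rule_confidence(attr_sets, antecedent, labels):
    """Confidence of the rule `antecedent -> label is True`."""
    covered = [label for attrs, label in zip(attr_sets, labels) if antecedent <= attrs]
    return sum(covered) / len(covered) if covered else 0.0


# Made-up reviews; the boolean marks a hypothetical "scored above 90" label.
reviews = [
    ("Ripe black cherry and vanilla, with a long finish.", True),
    ("Simple cherry fruit and toasty oak.", False),
    ("Blackberry, vanilla and oak notes; long finish.", True),
]
columns = sorted(set(TOY_WHEEL.values()))
attr_sets = [extract_attributes(text) for text, _ in reviews]
labels = [is_90_plus for _, is_90_plus in reviews]
matrix = [to_binary_row(attrs, columns) for attrs in attr_sets]
```

Rows of `matrix` are the kind of binary attribute vectors that a clustering or rule-mining step could consume. Matching longest phrases first is one simple way to keep a compound descriptor from also being counted under a shorter one; whether the paper handles overlaps this way is an assumption.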