A Crowdsourcing Semi-Supervised LSTM Training Approach to Identify Novel Items in Emerging Artificial Intelligent Environments

Edoardo Serra, Haritha Akella, A. Cuzzocrea
Published in: 2018 17th IEEE International Conference on Machine Learning and Applications (ICMLA), pp. 1479-1485
Publication date: 2018-12-01
DOI: 10.1109/ICMLA.2018.00241 (https://doi.org/10.1109/ICMLA.2018.00241)
Citations: 1

Abstract

New kinds of cuisine appear on the market all the time. Although main cuisines such as French, Italian, Japanese, Chinese and Indian are always appreciated, they are no longer the most popular. The new trend is fusion cuisine: a combination of different main cuisines, which is what makes it new. The opening of a restaurant offering a new kind of cuisine generates a lot of excitement, and people feel the need to try it and be part of this new culture. Yelp is a platform that publishes crowd-sourced reviews of businesses, in particular restaurants. Yelp allows the kind of cuisine to be declared for each restaurant. Unfortunately, because restaurant entries in the Yelp database are often created not by the owners but by the users who write the reviews, there is not much information about the kind of cuisine, especially for restaurants offering fusion cuisines. In this paper, we address the problem of identifying restaurants that offer new kinds of cuisine by using their Yelp reviews. These new cuisines can be completely new or fusion cuisines. Discriminating between main and fusion cuisines is very difficult because fusion cuisines resemble the main ones even though they are conceptually different. We propose 4Phase, a semi-supervised procedure that trains a Long Short-Term Memory (LSTM) network using only the text reviews of restaurants that offer main cuisines. The trained LSTM is ultimately used as a feature generator in combination with a standard novelty detection model (e.g., Gaussian Mixture Models). We perform experiments on Yelp data to separate restaurants offering main cuisines from those offering completely new or fusion cuisines. In these experiments, our 4Phase procedure outperforms all the baselines (term frequency, Doc2Vec, autoencoder LSTM, etc.) and reaches 0.91 for both AUROC and MAP.
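
The final stage described in the abstract (feature vectors scored by a novelty detection model, then evaluated with AUROC and MAP) can be illustrated with a minimal sketch. This is not the authors' 4Phase procedure. It makes several assumptions: the feature vectors are synthetic stand-ins for LSTM outputs, a single diagonal Gaussian is swapped in for the Gaussian Mixture Model to keep the example dependency-free, average precision over one ranking is used as the single-ranking case of MAP, and every function name is hypothetical.

```python
import math
import random

def fit_diag_gaussian(rows):
    """Estimate per-dimension mean and variance from known (main-cuisine) features."""
    n, d = len(rows), len(rows[0])
    mean = [sum(r[j] for r in rows) / n for j in range(d)]
    var = [max(sum((r[j] - mean[j]) ** 2 for r in rows) / n, 1e-6) for j in range(d)]
    return mean, var

def novelty_score(x, mean, var):
    """Negative log-likelihood under the fitted Gaussian; higher means more novel."""
    return sum(0.5 * math.log(2 * math.pi * v) + (xi - m) ** 2 / (2 * v)
               for xi, m, v in zip(x, mean, var))

def auroc(scores, labels):
    """Chance that a random novel item outscores a random known item (ties count half)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))

def average_precision(scores, labels):
    """Mean precision at the rank of each novel item, ranking by descending score.

    Assumes at least one item is labeled novel (label 1).
    """
    ranked = sorted(zip(scores, labels), key=lambda t: -t[0])
    hits, total = 0, 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            total += hits / rank
    return total / hits

def sample(rng, center, count, dim=4):
    """Draw synthetic feature vectors around a center (stand-in for LSTM features)."""
    return [[rng.gauss(center, 1.0) for _ in range(dim)] for _ in range(count)]

rng = random.Random(0)
train_known = sample(rng, 0.0, 200)   # only known-class data is used for fitting
test_known = sample(rng, 0.0, 50)     # label 0: main cuisines
test_novel = sample(rng, 3.0, 50)     # label 1: new or fusion cuisines

mean, var = fit_diag_gaussian(train_known)
test_rows = test_known + test_novel
labels = [0] * len(test_known) + [1] * len(test_novel)
scores = [novelty_score(x, mean, var) for x in test_rows]

roc = auroc(scores, labels)
ap = average_precision(scores, labels)
print(f"AUROC={roc:.3f}  AP={ap:.3f}")
```

The detector is fitted only on known-class vectors, mirroring the semi-supervised setting in which no novel examples are available at training time. The synthetic classes here are far easier to separate than main and fusion cuisines, so the scores it prints say nothing about the paper's reported results.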