Comparative Analysis for Prediction of Success of Bollywood Movie

Ritu Khandelwal, Harshita Virwani
Published in SUSCOM 2019: Proceedings, 2019-02-21. DOI: 10.2139/ssrn.3350907 (https://doi.org/10.2139/ssrn.3350907)
Citations: 1

Abstract

In terms of film production, Indian cinema is one of the largest film industries, with more than a thousand movies released each year. Bollywood accounts for a large part of Indian cinema and is a multi-million-dollar industry. This paper therefore attempts to predict whether an upcoming Bollywood movie will be a Blockbuster, Superhit, Hit, Average, or Flop, using data mining techniques (classification and prediction). Data mining is the process of discovering patterns and relationships in large data sets in order to solve business problems and to help predict upcoming trends. The main difference between classification and prediction is that classification predicts discrete categorical labels, whereas prediction techniques predict continuous values. Every large business handles a large amount of data; by analyzing it, a company can define strategies to achieve its business aims and profits. Various techniques are available for this purpose through data visualization tools such as Microsoft Power BI, Tableau, Logi Analytics, and Orange. Building a classifier or prediction model starts with a learning stage, in which a training data set is given to a technique or algorithm to train the model; the rules generated in this stage make up the model, which can then be used to predict future trends in different types of organizations. Classification and prediction techniques such as Decision Tree, Naïve Bayes, Logistic Regression, AdaBoost, and KNN will be applied in search of efficient and effective results. These algorithms will be applied in Orange, an open-source tool that brings data mining, machine learning, and data visualization together in one place. All of these functions are available through GUI-based workflows whose widgets are grouped into categories such as Data, Visualize, Model, and Evaluate. This paper focuses on a comparative analysis based on parameters such as accuracy and the confusion matrix to identify the best possible model for predicting a movie's success. Such a prediction can help production houses plan advertising and promotion and budget their costs, and thereby make a movie more profitable. It can also help them decide on the best time to release a movie, according to the success predicted by the model, to gain higher returns.
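The comparison the abstract describes (train several classifiers, then rank them by accuracy and inspect their confusion matrices) can be illustrated with a small sketch. This is not the authors' Orange workflow: it is plain Python using only the standard library, it compares just a k-nearest-neighbours classifier with a majority-class baseline, and the two features (`budget_score`, `star_power_score`) and every data row are made up purely for illustration.

```python
from collections import Counter

# The five verdict classes named in the abstract.
LABELS = ["Blockbuster", "Superhit", "Hit", "Average", "Flop"]

# Hypothetical toy data: ([budget_score, star_power_score], verdict).
# These rows are invented for illustration, not taken from the paper.
TRAIN = [
    ([9, 9], "Blockbuster"), ([8, 9], "Blockbuster"), ([9, 8], "Blockbuster"),
    ([7, 6], "Hit"), ([6, 7], "Hit"), ([6, 6], "Hit"), ([7, 7], "Hit"),
    ([2, 1], "Flop"), ([1, 2], "Flop"), ([2, 2], "Flop"),
]
TEST = [
    ([9, 10], "Blockbuster"), ([6, 5], "Hit"),
    ([1, 1], "Flop"), ([5, 5], "Average"),
]


def knn_predict(features, k=3):
    """Majority vote among the k training rows closest to `features`."""
    def sq_dist(row):
        return sum((a - b) ** 2 for a, b in zip(row[0], features))
    nearest = sorted(TRAIN, key=sq_dist)[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


def majority_predict(features):
    """Baseline: ignore the features, always predict the most common class."""
    return Counter(label for _, label in TRAIN).most_common(1)[0][0]


def confusion_matrix(actual, predicted):
    """matrix[i][j] counts movies of true class LABELS[i] predicted as LABELS[j]."""
    index = {label: i for i, label in enumerate(LABELS)}
    matrix = [[0] * len(LABELS) for _ in LABELS]
    for a, p in zip(actual, predicted):
        matrix[index[a]][index[p]] += 1
    return matrix


def accuracy(matrix):
    """Correct predictions (the diagonal) divided by all predictions."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total if total else 0.0


def evaluate(predict):
    """Run one model over TEST and return (accuracy, confusion matrix)."""
    actual = [label for _, label in TEST]
    predicted = [predict(features) for features, _ in TEST]
    matrix = confusion_matrix(actual, predicted)
    return accuracy(matrix), matrix


models = [("kNN (k=3)", knn_predict), ("Majority baseline", majority_predict)]
results = {name: evaluate(fn) for name, fn in models}

for name, (acc, matrix) in results.items():
    print(f"{name}: accuracy = {acc:.2f}")
    for label, row in zip(LABELS, matrix):
        print(f"  {label:<12}{row}")

# Pick the model with the highest accuracy on the held-out rows.
best = max(results, key=lambda name: results[name][0])
print("Best model on this toy data:", best)
```

Accuracy gives a single number for ranking models, while the confusion matrix shows which verdict classes a model mixes up. Here, for example, the kNN sketch cannot predict "Average" because no training row carries that label. In a real study the same comparison would be run over all five algorithms on actual box-office data.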