Friday 11 March 2016

Predicting the Future With Social Media

Introduction:
Social media are computer-mediated tools that allow people or companies to create, share, or exchange information, career interests, ideas, and pictures/videos in virtual communities and networks. Because social media aggregate a large volume of such information, the chatter of a community can be used to make quantitative predictions that outperform those of artificial markets.
We consider the task of predicting box-office revenues for movies using the chatter from Twitter. First, we assess how attention and activity build up for different movies and how they change over time. Our hypothesis is that movies that are well talked about will be well watched.
We then study how sentiments are created and how positive and negative opinions influence people. To do so, we perform sentiment analysis on the data, using text classifiers to distinguish positively oriented tweets from negative ones.

Dataset Characteristics:
The dataset that we used was obtained by crawling a regular feed of data from Twitter.com.
Attention and Popularity:
Our aim is to observe whether the knowledge that can be extracted from the tweets can lead to reasonably accurate predictions of future outcomes in the real world.

Effect of Promotional Material: URLs and Retweets:
Before a movie is released, promotional material is generated in the form of trailers and news. First, we examine the distribution of tweets for the different movies; then we examine their correlation with the performance of the movies.

Rate of Tweet Mentions:
We define the tweet-rate as the number of tweets referring to a particular movie per hour. An initial analysis of the average tweet-rate against box-office revenue gives a correlation coefficient between the two, which motivates a regression. We construct a linear regression model, fitted by least squares, on the average tweet-rate of each movie over the week prior to release, with the box-office revenue as the dependent variable and the tweet-rate as the explanatory variable:

    Rev(mov) = β0 + β1 ∗ Tweet-rate(mov)

where β0 and β1 are the regression coefficients.
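The tweet-rate and the least-squares fit can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the code used in the study: the function names (tweet_rate, fit_least_squares, predict_revenue) are hypothetical, and the tweet counts and revenues are invented numbers chosen only to make the example run.

```python
from statistics import mean


def tweet_rate(tweet_count, hours):
    """Tweets referring to one movie per hour over the observation window."""
    if hours <= 0:
        raise ValueError("hours must be positive")
    return tweet_count / hours


def fit_least_squares(x, y):
    """Return (beta0, beta1) minimising squared error for y = beta0 + beta1 * x."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equally long samples with at least 2 points")
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x values must not all be identical")
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    beta1 = sxy / sxx
    beta0 = y_bar - beta1 * x_bar
    return beta0, beta1


def predict_revenue(beta0, beta1, rate):
    """Rev(mov) = beta0 + beta1 * Tweet-rate(mov)."""
    return beta0 + beta1 * rate


# Invented numbers for illustration only; they are not data from the study.
HOURS_IN_WEEK = 7 * 24
tweet_counts = [16800, 50400, 84000, 134400]  # tweets per movie, pre-release week
rates = [tweet_rate(count, HOURS_IN_WEEK) for count in tweet_counts]
revenues = [12.0, 32.0, 52.0, 82.0]  # hypothetical revenue, in millions

beta0, beta1 = fit_least_squares(rates, revenues)
print(predict_revenue(beta0, beta1, 400.0))
```

The closed-form solution is used here because the model has a single explanatory variable; with more predictors a general least-squares solver would be the natural choice.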

Comparison with HSX:
Artificial online markets such as the Foresight Exchange and the Hollywood Stock Exchange (HSX) are good indicators of future outcomes. The prices in these markets have been shown to have strong correlations with observed outcome frequencies. To compare against our tweet-rate predictor, we ran a regression on the movie stock prices from the Hollywood Stock Exchange, which can be considered the gold standard.
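One way to compare two single-variable predictors is to fit each against the same revenues and compare the coefficient of determination (R squared). The sketch below assumes that measure; the text above does not state which measure was used. The numbers are invented and say nothing about the actual result; r_squared is a hypothetical helper name.

```python
def r_squared(x, y):
    """R squared of the least-squares line y = b0 + b1 * x.

    For a single explanatory variable this equals the squared
    Pearson correlation between x and y.
    """
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equally long samples with at least 2 points")
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)
    if sxx == 0 or syy == 0:
        raise ValueError("both samples need some variation")
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    return (sxy * sxy) / (sxx * syy)


# Invented values for illustration only; not data from the study.
revenues = [10.0, 20.0, 30.0, 40.0]
tweet_rates = [1.0, 2.0, 3.0, 4.0]
hsx_prices = [1.0, 3.0, 2.0, 4.0]

print("tweet-rate R^2:", r_squared(tweet_rates, revenues))
print("HSX price R^2:", r_squared(hsx_prices, revenues))
```

The predictor with the higher value explains more of the variance in revenue on that sample.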
Sentiment Analysis:
We use the DynamicLMClassifier, a language model classifier that accepts training events of categorized character sequences. Training is based on a multivariate estimator for the category distribution and dynamic language models for the per-category character sequence estimators.
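DynamicLMClassifier comes from the Java library LingPipe. To show the idea in Python, here is a much simplified stand-in and not LingPipe's implementation: a category prior combined with one character n-gram model per category, using add-one smoothing. The class name CharNGramClassifier, the smoothing scheme, and the training sentences are all assumptions made for illustration.

```python
import math
from collections import Counter, defaultdict


class CharNGramClassifier:
    """Toy character n-gram language-model classifier (not LingPipe itself)."""

    def __init__(self, n=3):
        self.n = n
        self.category_counts = Counter()            # training events per category
        self.ngram_counts = defaultdict(Counter)    # category -> n-gram -> count
        self.context_counts = defaultdict(Counter)  # category -> context -> count
        self.alphabet = set()                       # characters seen in training

    def _ngrams(self, text):
        # Pad on the left so the first characters also get a full-length context.
        padded = " " * (self.n - 1) + text.lower()
        for i in range(len(padded) - self.n + 1):
            yield padded[i:i + self.n]

    def train(self, category, text):
        """Add one categorized character sequence as a training event."""
        self.category_counts[category] += 1
        for gram in self._ngrams(text):
            self.ngram_counts[category][gram] += 1
            self.context_counts[category][gram[:-1]] += 1
            self.alphabet.add(gram[-1])

    def log_score(self, category, text):
        """log P(category) + log P(text | category), add-one smoothed."""
        total = sum(self.category_counts.values())
        score = math.log(self.category_counts[category] / total)
        vocab = len(self.alphabet) + 1  # one extra slot for unseen characters
        for gram in self._ngrams(text):
            seen = self.ngram_counts[category][gram]
            context = self.context_counts[category][gram[:-1]]
            score += math.log((seen + 1) / (context + vocab))
        return score

    def classify(self, text):
        """Return the category with the highest log score."""
        return max(self.category_counts, key=lambda c: self.log_score(c, text))


# Made-up training events, for illustration only.
classifier = CharNGramClassifier(n=3)
classifier.train("positive", "loved it, great movie")
classifier.train("positive", "awesome film, really great")
classifier.train("negative", "terrible movie, waste of time")
classifier.train("negative", "boring and awful film")

print(classifier.classify("great"))
print(classifier.classify("awful"))
```

Working on characters rather than words lets the model cope with misspellings and informal tweet vocabulary, at the cost of needing far more training text than this toy example has.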

Subjectivity:
When we computed the subjectivity values for all the movies, we observed that our hypothesis held: more sentiment was found in tweets for the weeks after release than in the pre-release week.
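The subjectivity measure is not spelled out above. A minimal sketch, assuming it is the ratio of opinionated (positive plus negative) tweets to neutral tweets, as we understand the cited paper to define it, and assuming every tweet has already been labeled by the classifier. The label strings and the sample lists are hypothetical.

```python
from collections import Counter


def subjectivity(labels):
    """Ratio of positive and negative tweets to neutral tweets."""
    counts = Counter(labels)
    if counts["neutral"] == 0:
        raise ValueError("ratio is undefined without neutral tweets")
    return (counts["positive"] + counts["negative"]) / counts["neutral"]


# Invented labels for illustration only; not data from the study.
pre_release = ["neutral", "neutral", "neutral", "positive"]
post_release = ["neutral", "positive", "positive", "negative"]

print(subjectivity(pre_release))
print(subjectivity(post_release))
```

A higher value after release than before it is the pattern the paragraph above describes.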

Polarity:

Conclusion:
Social media can be used to forecast future outcomes. Specifically, using the rate of chatter from almost 3 million tweets from the popular site Twitter, we constructed a linear regression model for predicting box-office revenues of movies in advance of their release. We then showed that the results are more accurate than those of the Hollywood Stock Exchange, and that there is a strong correlation between the amount of attention a given topic receives (in this case a forthcoming movie) and its ranking in the future.

References:
Asur, Sitaram, and Bernardo A. Huberman. "Predicting the future with social media." Web Intelligence and Intelligent Agent Technology (WI-IAT), 2010 IEEE/WIC/ACM International Conference on. Vol. 1. IEEE, 2010.



1 comment:

  1. This is actually done today (to predict US elections for example)