Comparing movie performance
faizharry4
Hi, I'm doing a project in RapidMiner using Search Twitter and sentiment analysis. I'm trying to find a way to show that Marvel movies are better than DC movies, and I'm also trying to extract new attributes from the data I've collected: for example, which common words are used to describe The Avengers, and which words are used in positive, negative, and neutral tweets. So far I have no idea how to do that. I've already collected the data using Search Twitter and sentiment analysis, but the later part is a puzzler. Can you please help me?
Answers
@faizharry4 that's an interesting problem. It'll be hard to compare sentiment for Spiderman tweets vs Superman tweets. Have you thought about extracting the sentiment scores for DC vs Marvel movies and doing a weighted rolling average? Like 1,000 positive / 20,000 tweets for DC vs 500 positive / 6,000 tweets for Marvel, doing it per day and trending it? That way you might be able to see a rate of change before and after a movie is released.
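The per-day trend suggested above can be sketched outside RapidMiner in a few lines of Python. This is only an illustration under assumptions: the function name `rolling_positive_rate`, the three-day window, and every daily count except the first pair for each studio (taken from the reply) are made up for the example.

```python
from collections import deque


def rolling_positive_rate(daily_counts, window=3):
    """Tweet-weighted share of positive tweets over a trailing window.

    daily_counts: list of (positive_tweets, total_tweets), one per day, in date order.
    Returns one rate per day; early days use however many days are available.
    """
    recent = deque(maxlen=window)  # keeps only the last `window` days
    rates = []
    for positive, total in daily_counts:
        recent.append((positive, total))
        pos_sum = sum(p for p, _ in recent)
        tot_sum = sum(t for _, t in recent)
        # Dividing summed positives by summed totals weights busy days more heavily.
        rates.append(pos_sum / tot_sum if tot_sum else 0.0)
    return rates


# Hypothetical daily counts; only the first day of each list comes from the thread.
dc_daily = [(1000, 20000), (900, 15000), (1200, 18000)]
marvel_daily = [(500, 6000), (700, 7000), (650, 6500)]

dc_trend = rolling_positive_rate(dc_daily)
marvel_trend = rolling_positive_rate(marvel_daily)

for day, (dc, marvel) in enumerate(zip(dc_trend, marvel_trend), start=1):
    print(f"day {day}: DC {dc:.3f}  Marvel {marvel:.3f}")
```

Comparing rates rather than raw counts is what makes 1,000 positives out of 20,000 tweets comparable with 500 out of 6,000.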
Basically I'm trying to compare Infinity War vs Justice League. What I've done so far is retrieve data from Twitter using Search Twitter, analyze the sentiment with Aylien, then use Data to Documents, then Categorize (Document), followed by the Documents to Data operator, and finally Write Excel to store the retrieved data. So now I have 200 tweets for each movie, and I'm stuck on the next move, which is how to compare the two movies.
@faizharry4 200 tweets for each movie sounds awfully low. Maybe start by generating a Wordlist for each movie and see which words are most commonly used to describe each one?
@Thomas_Ott the 200 tweets are only for getting started; I'll add more tweets once I've figured out the solution. Anyway, as you suggested, how do I generate a Wordlist for each movie and see the most common words used to describe each movie in RapidMiner?
And can we import data directly from Metacritic, IMDb, and Rotten Tomatoes so that I can compare the performance of the two movies, and then import other data from any website that has the gross of both films?
@faizharry4 use the Process Documents from Data operator and embed a tokenizer and other text processing operators. Then output the WOR port.
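For anyone who wants to sanity-check the wordlist outside RapidMiner, the same idea (tokenize, filter, count) can be roughly sketched in Python. The sample tweets, the tiny stopword set, and the helper name `top_words` are all invented for illustration; this is not what the Process Documents from Data operator does internally.

```python
import re
from collections import Counter

# Tiny illustrative stopword list; a real process would use a fuller one.
STOPWORDS = {"the", "a", "an", "is", "was", "it", "and", "of", "to", "in", "i", "this", "that"}


def top_words(tweets, n=10):
    """Return the n most frequent non-stopword tokens across a list of tweets."""
    counts = Counter()
    for text in tweets:
        # Lowercase and keep alphabetic tokens only (a very simple tokenizer).
        tokens = re.findall(r"[a-z']+", text.lower())
        counts.update(t for t in tokens if t not in STOPWORDS and len(t) > 2)
    return counts.most_common(n)


# Made-up sample tweets, one list per movie.
tweets_by_movie = {
    "Infinity War": ["Infinity War was epic and emotional", "Epic ending, Thanos is terrifying"],
    "Justice League": ["Justice League was messy", "The CGI was messy but Flash was fun"],
}

wordlists = {movie: top_words(tweets, n=5) for movie, tweets in tweets_by_movie.items()}

for movie, words in wordlists.items():
    print(movie, words)
```

Building one wordlist per movie, instead of one for all tweets together, is what makes the two sets of common words directly comparable.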
@faizharry4 also, you can get IMDb and Rotten Tomatoes info using the Web Mining extension; you just have to create the process.
@Thomas_Ott thanks. I've tried to create a process for the word count, but I come up blank. I tried to do a word association (which words are associated with positive, negative, and neutral polarity), but the result is empty.
@faizharry4 you need the Process Documents from Data operator, not Process Documents from Files.
Also, you will probably need to use a Nominal to Text conversion operator.
@Thomas_Ott I tried other methods, but it seems my luck is not there; it still won't give the result I want. The sentiment analysis categorizes the polarity of each tweet. Is it possible to find out which words are associated with neutral, positive, and negative?
@faizharry4 if you're passing the sentiment into the Process Documents operator, try setting it as a label role. Or, if you are using the Extract Sentiment operator and set the Vector Creation to Binary Occurrences, you can output the EXA port and see the sentiment for the tweet ID and what word it's attached to.
Like this:
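For readers working outside RapidMiner, the idea in the reply above (binary word occurrences kept next to a sentiment label) can be approximated in Python. This is a sketch under assumptions, not a reproduction of the RapidMiner output: the labelled sample tweets and the function name `words_by_sentiment` are hypothetical, and the polarity labels are assumed to be already attached to each tweet.

```python
import re
from collections import Counter, defaultdict


def words_by_sentiment(labelled_tweets):
    """Count, per polarity label, how many tweets contain each word.

    labelled_tweets: iterable of (text, polarity) pairs.
    Each word counts at most once per tweet (binary occurrence).
    """
    doc_freq = defaultdict(Counter)
    for text, polarity in labelled_tweets:
        # A set removes repeats inside one tweet, giving binary occurrences.
        tokens = set(re.findall(r"[a-z']+", text.lower()))
        doc_freq[polarity].update(tokens)
    return doc_freq


# Made-up labelled tweets standing in for the sentiment analysis output.
sample = [
    ("Loved it, loved every minute", "positive"),
    ("Loved the cast", "positive"),
    ("Boring plot", "negative"),
    ("Saw it last night", "neutral"),
]

freq = words_by_sentiment(sample)

for polarity, counts in freq.items():
    print(polarity, counts.most_common(3))
```

Counting each word once per tweet keeps a single repetitive tweet from dominating the list for its polarity.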