Comparing movie performance

faizharry4 Member Posts: 5 Contributor I
edited November 2018 in Help

Hi, I'm doing a project in RapidMiner using the Search Twitter operator and sentiment analysis. I'm trying to find a way to show that Marvel movies are better than DC movies, and I'm also trying to extract new attributes from the data I have collected. For example: which common words are used to describe The Avengers? Which words are associated with positive, negative, and neutral tweets? So far I have no idea how to do that. I have already collected the data using Search Twitter and sentiment analysis, but the later part is a puzzler. Can you please help me?

Answers

  • Thomas_Ott RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 1,761 Unicorn

    @faizharry4 that's an interesting problem. It'll be hard to compare sentiment for Spiderman tweets vs Superman tweets. Have you thought about extracting the sentiment scores for DC vs Marvel movies and computing a weighted rolling average? For example, 1,000 positive out of 20,000 tweets for DC vs 500 positive out of 6,000 tweets for Marvel, calculated per day and trended over time. That way you might be able to see a rate of change before and after a movie is released.
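
As a rough illustration of the weighted rolling average idea, here is a minimal Python sketch outside RapidMiner. It assumes each tweet has already been reduced to a (day, franchise, polarity) row; the function names, the window size, and the sample rows are all made up for illustration. Pooling positives and totals over the window means busy days count for more, and dividing by the total keeps the comparison fair when one franchise has far more tweets (1,000/20,000 is 5%, while 500/6,000 is about 8.3%).

```python
from collections import defaultdict


def daily_counts(tweets):
    """Return {franchise: {day: [positive, total]}} from (day, franchise, polarity) rows."""
    counts = defaultdict(lambda: defaultdict(lambda: [0, 0]))
    for day, franchise, polarity in tweets:
        bucket = counts[franchise][day]
        bucket[1] += 1                 # every tweet counts toward the total
        if polarity == "positive":
            bucket[0] += 1             # only positive tweets count here
    return counts


def weighted_rolling_rate(day_counts, window=3):
    """Volume-weighted rolling share of positive tweets.

    The window covers the last `window` days that have data (not calendar
    days). Positives and totals are pooled, so days with more tweets weigh more.
    """
    days = sorted(day_counts)          # ISO date strings sort chronologically
    rates = {}
    for i, day in enumerate(days):
        recent = days[max(0, i - window + 1): i + 1]
        positive = sum(day_counts[d][0] for d in recent)
        total = sum(day_counts[d][1] for d in recent)
        rates[day] = positive / total
    return rates


# Hypothetical sample rows, not real Twitter data.
tweets = [
    ("2018-01-01", "Marvel", "positive"),
    ("2018-01-01", "Marvel", "negative"),
    ("2018-01-02", "Marvel", "positive"),
    ("2018-01-02", "Marvel", "positive"),
    ("2018-01-01", "DC", "positive"),
    ("2018-01-01", "DC", "neutral"),
    ("2018-01-01", "DC", "negative"),
    ("2018-01-01", "DC", "negative"),
]

rates = {
    franchise: weighted_rolling_rate(counts, window=2)
    for franchise, counts in daily_counts(tweets).items()
}
for franchise, by_day in rates.items():
    print(franchise, by_day)
```

Plotting the per-day rates for each franchise on one chart would then show whether the trend shifts around a release date.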

  • faizharry4 Member Posts: 5 Contributor I

    Basically, I'm trying to compare Infinity War vs Justice League. What I have done so far is retrieve data from Twitter using Search Twitter, analyze the sentiment with Aylien, then use Data to Documents, Categorize (Document), and Documents to Data, and finally Write Excel to store the retrieved data. So now I have 200 tweets for each movie, and I'm stuck on the next step: how to compare the two movies.

  • Thomas_Ott RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 1,761 Unicorn

    @faizharry4 200 tweets for each movie sounds awfully low. Maybe start by generating a Wordlist for each movie and see which words are most commonly used to describe each one?

  • faizharry4 Member Posts: 5 Contributor I

    @Thomas_Ott the 200 tweets are only for getting started; I will add more tweets once I have figured out the solution. Anyway, as you suggested, how can I generate a Wordlist for each movie and see the most common words used to describe each movie in RapidMiner?

    And can we import data directly from Metacritic, IMDb, and Rotten Tomatoes so that I can compare the performance of the two movies, and then import other data from any website that lists the gross of both films?

  • Thomas_Ott RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 1,761 Unicorn

    @faizharry4 use the Process Documents from Data operator and embed a Tokenize operator and other text processing operators inside it. Then take the output from the WOR (word list) port.
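
For readers who want to see what the word list step boils down to, here is a minimal Python sketch of the same idea outside RapidMiner: tokenize on non-letters, drop stop words and very short tokens, and count what is left per movie. It is not the RapidMiner implementation; the stop word set, the `min_length` cutoff, the function names, and the sample tweets are all assumptions made for illustration.

```python
import re
from collections import Counter

# Tiny illustrative stop word set; a real list would be much longer.
STOP_WORDS = {"the", "a", "an", "is", "was", "and", "of", "to", "in", "it", "this"}


def tokenize(text):
    """Lowercase the text and split it on anything that is not a letter."""
    return [token for token in re.split(r"[^a-z]+", text.lower()) if token]


def word_list(texts, min_length=3):
    """Count how often each remaining token occurs across all texts."""
    counts = Counter()
    for text in texts:
        counts.update(
            token
            for token in tokenize(text)
            if token not in STOP_WORDS and len(token) >= min_length
        )
    return counts


# Hypothetical tweets, one list per movie.
tweets_by_movie = {
    "Infinity War": [
        "Infinity War was epic, epic ending!",
        "The ending of Infinity War shocked me",
    ],
    "Justice League": [
        "Justice League was messy",
        "Justice League had a messy plot but fun cast",
    ],
}

word_lists = {movie: word_list(texts) for movie, texts in tweets_by_movie.items()}
for movie, counts in word_lists.items():
    print(movie, counts.most_common(5))
```

In practice the movie title itself would dominate each list, so it may be worth adding the title words to the stop word set before comparing the two lists.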

  • Thomas_Ott RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 1,761 Unicorn

    @faizharry4 also, you can get IMDb and Rotten Tomatoes info using the Web Mining extension; you just have to create the process.

  • faizharry4 Member Posts: 5 Contributor I

    @Thomas_Ott thanks. I have tried to create a process for the word count, but I came up blank. I tried to do a word association to find which words are associated with positive, negative, and neutral polarity, but the result is empty.

    Attachments: 1.png, 2.png, 3.png
  • Thomas_Ott RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 1,761 Unicorn

    @faizharry4 you need the Process Documents from Data operator, not Process Documents from Files.

    Also, you will probably need to use a Nominal to Text conversion operator.

  • faizharry4 Member Posts: 5 Contributor I

    @Thomas_Ott I have tried other approaches, but it seems my luck is not there. It still won't give the result that I want. The sentiment analysis categorizes the polarity of each tweet as a whole. Is it possible to find out which words are associated with neutral, positive, and negative?

    Attachments: 4.png, 5.png, 6.png
  • Thomas_Ott RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 1,761 Unicorn

    @faizharry4 if you're passing the sentiment into the Process Documents operator, try setting it to the label role. Or, if you are using the Extract Sentiment operator and set the vector creation to Binary Term Occurrences, you can output the EXA port and see the sentiment for each tweet ID and which words it's attached to.

    Like this:

    <parameter key="attribute_filter_type" value="single"/>
    <parameter key="attribute" value="Text"/>

    <parameter key="vector_creation" value="Binary Term Occurrences"/>
    <parameter key="prune_method" value="percentual"/>

    <parameter key="directory" value="C:\Users\TomOtt\OneDrive\wordnet\WordNet-3.0\dict"/>

    <parameter key="group_attribute" value="sentiment"/>
    <parameter key="index_attribute" value="Id"/>

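To make the binary term occurrence idea concrete, here is a minimal Python sketch of the general technique, not a translation of the RapidMiner process: each word counts at most once per tweet, and the counts are then grouped by the tweet's sentiment label. The function names and the sample rows are hypothetical, and the sketch assumes the sentiment label is already available for every tweet.

```python
import re
from collections import Counter, defaultdict


def binary_terms(text):
    """Distinct lowercase words in one tweet (a word counts once per tweet)."""
    return set(re.findall(r"[a-z]+", text.lower()))


def words_by_sentiment(rows):
    """Map each sentiment to a Counter of word -> number of tweets containing it.

    rows: iterable of (tweet_id, text, sentiment) tuples.
    """
    table = defaultdict(Counter)
    for _tweet_id, text, sentiment in rows:
        table[sentiment].update(binary_terms(text))
    return table


# Hypothetical rows, not real Twitter data.
rows = [
    (1, "Loved it, loved the ending", "positive"),
    (2, "Loved the cast", "positive"),
    (3, "Boring and too long", "negative"),
    (4, "It was okay", "neutral"),
]

table = words_by_sentiment(rows)
for sentiment, counts in table.items():
    print(sentiment, counts.most_common(3))
```

Words that appear in many tweets of one sentiment and few of the others are the candidates for "words associated with" that sentiment; stop words such as "the" would need filtering first.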