Sentiment Analysis Vader Normalization of results

LaraNeu Member Posts: 4 Learner I
Hi RapidMiner Community,

I have a question about the results of a sentiment analysis I ran on online reviews from Airbnb, which I want to compare with the ratings given by the reviewers. The Extract Sentiment operator with VADER gives me unstandardized values that I can hardly compare with other values such as ratings (am I correct?).
In another post in the forum by @mschmitz I found two formulas to normalize the sentiment scores and get results between -1 and +1:
- Sentiment Score/Total tokens
- Sentiment Score/(Total tokens - Uncovered tokens)
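
To make the difference between the two formulas concrete, here is a minimal Python sketch. The function names are hypothetical, and it assumes that "uncovered tokens" are the tokens the sentiment lexicon could not score; it is an illustration, not the operator's actual implementation.

```python
def normalize_by_total(score, total_tokens):
    """Formula 1: sentiment score divided by all tokens in the review."""
    if total_tokens <= 0:
        return 0.0  # assumption: treat an empty review as neutral
    return score / total_tokens


def normalize_by_covered(score, total_tokens, uncovered_tokens):
    """Formula 2: sentiment score divided by the tokens that were scored."""
    covered = total_tokens - uncovered_tokens
    if covered <= 0:
        return 0.0  # assumption: no scored token means a neutral review
    return score / covered


# Illustrative values only: score 3.0, 12 tokens, 8 of them not scored.
per_total = normalize_by_total(3.0, 12)          # 3.0 / 12
per_covered = normalize_by_covered(3.0, 12, 8)   # 3.0 / (12 - 8)
print(per_total, per_covered)
```

Under that assumption, the first formula is the average sentiment per word of the whole review, so long reviews with many unscored words move towards 0. The second is the average over only the scored words, so its magnitude is never smaller than the first.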

I used both formulas for my data and obviously got two different results. Can someone help me with the interpretation of the results for both formulas? What is the 'right' formula? What is the best way to standardize my data to be able to compare it with the ratings?

Thanks a lot for your help! Really appreciate it!

Answers

  • MartinLiebig Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, University Professor Posts: 3,404 RM Data Scientist
    I think there is no single 'correct' formula. The question is how best to normalize the score, and I think all three approaches (including not normalizing at all) are valid.
    Why don't you just calculate the correlation with the star ratings, or a metric like information gain, to see what works best?

    ~Martin
    - Sr. Director Data Solutions, Altair RapidMiner -
    Dortmund, Germany
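
The suggestion above, comparing each variant against the star ratings, can be sketched in plain Python. The rows below are made-up toy values purely for illustration, and the column layout and variable names are assumptions. Pearson correlation is used here as one possible choice of metric.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# Toy data, NOT real results: (score, total_tokens, uncovered_tokens, stars)
reviews = [
    (3.0, 12, 8, 5),
    (1.0, 20, 15, 4),
    (-2.0, 10, 6, 2),
    (0.5, 30, 28, 3),
    (-4.0, 16, 8, 1),
]

stars = [r[3] for r in reviews]
variants = {
    "raw score": [r[0] for r in reviews],
    "score / total": [r[0] / r[1] for r in reviews],
    "score / (total - uncovered)": [r[0] / (r[1] - r[2]) for r in reviews],
}

# The variant with the strongest correlation tracks the ratings best.
correlations = {name: pearson(vals, stars) for name, vals in variants.items()}
for name, r in correlations.items():
    print(f"{name}: r = {r:.3f}")
```

On real data, the variant with the highest correlation would be the one to keep. If all three come out similar, the choice of normalization matters little for this comparison.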