How can I make RapidMiner do sentiment analysis of emoji?
I have an Excel sheet containing tweets with emoji. I need to run sentiment analysis on the emoji alone, without the tweet text, and make sure RapidMiner understands the emoji.
Best Answers
BalazsBarany Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert Posts: 953 Unicorn
Hi @Moha,
you can use the Replace operator with regular expressions like "[a-zA-Z .:!?-]" (search term) and "" (replacement) to get rid of everything but the emojis in the tweets. However, in my tests, RapidMiner doesn't display the emojis.
It also won't just "understand" the emoji.
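For readers who want to try the Replace step outside RapidMiner, here is a minimal Python sketch that applies the same character class. The function name `strip_text` and the sample tweets are made up for illustration; this is not how the RapidMiner operator is implemented.

```python
import re

# Same character class as the search term above:
# ASCII letters, space, and the punctuation . : ! ? -
NON_EMOJI = re.compile(r"[a-zA-Z .:!?-]")

def strip_text(tweet):
    """Delete every character matched by the class and keep the rest."""
    return NON_EMOJI.sub("", tweet)

print(strip_text("Great match today! 😀😀"))
```

Note that digits, hashtags, commas and similar characters are not in the class, so they survive the replacement; the class would likely need to be extended for real tweets.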
Sentiment analysis is a classification problem, so you need labeled examples (categorized as positive or negative). However, this won't work for just one character. If you have tweets with multiple emojis that you can categorize as positive or negative, you could build a model on the assumption that these emojis match the entire tweet's sentiment. Using the Text Processing extension, you would tokenize into single characters.
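As a rough illustration of the idea (labeled examples plus tokenizing into single characters), here is a toy Python sketch. The labeled strings are hypothetical, and the count-based vote is a deliberately simple stand-in for whatever learner you would actually train in RapidMiner.

```python
from collections import Counter

def tokenize_chars(text):
    """Split a string into single characters, dropping whitespace."""
    return [ch for ch in text if not ch.isspace()]

def train(examples):
    """Count how often each character appears under each label."""
    counts = {"positive": Counter(), "negative": Counter()}
    for text, label in examples:
        counts[label].update(tokenize_chars(text))
    return counts

def predict(counts, text):
    """Vote: positive-label counts minus negative-label counts per character."""
    score = 0
    for ch in tokenize_chars(text):
        score += counts["positive"][ch] - counts["negative"][ch]
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "unknown"

# Hypothetical labeled examples (emoji-only strings with a tweet-level label).
labeled = [
    ("😀😍", "positive"),
    ("😀😀", "positive"),
    ("😢😡", "negative"),
    ("😡", "negative"),
]
model = train(labeled)
print(predict(model, "😍😀"))
```

One caveat: some emojis are made of several code points (skin tones, joined sequences), so splitting into single characters can break them apart.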
If you actually look at the visual representation of the emoji, the effort will be much higher as you're changing this to an image classification problem. RapidMiner can only support parts of that process.
Regards,
Balázs
sgenzer Administrator, Moderator, Employee, RapidMiner Certified Analyst, Community Manager, Member, University Professor, PM Moderator Posts: 2,959 Community Manager
Hi @Moha, so just to add to what my friend @BalazsBarany says... it would be my recommendation to do this:
- Use Replace with RegEx as @BalazsBarany explains above to get rid of all other text besides the emojis.
- Use the "Encode URL" operator in the Web Mining extension to convert your emojis to UTF-8
- Create a lookup table that has classifications for emojis as you desire (it is very subjective, but perhaps one emoji is positive, another is negative, etc.). I'd keep it simply pos/neg for starters.
- Join the lookup table with your data set to classify + create a training set
- Perform normal classification data science (you may want to start with Auto Model for a quick prototype)
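The encode, lookup and join steps above can be sketched in plain Python to show the shape of the data. This assumes the encoding step produces percent-encoded UTF-8, which is what `urllib.parse.quote` does; the lookup labels, the `label_rows` helper and the sample strings are hypothetical.

```python
from urllib.parse import quote

# Hypothetical lookup table: percent-encoded emoji -> label.
LOOKUP = {
    quote("😀"): "pos",
    quote("😢"): "neg",
}

def label_rows(emoji_strings):
    """Inner-join each emoji against the lookup table, one row per match."""
    rows = []
    for text in emoji_strings:
        for ch in text:
            key = quote(ch)  # e.g. a percent-encoded UTF-8 byte sequence
            if key in LOOKUP:
                rows.append({"emoji": ch, "encoded": key, "label": LOOKUP[key]})
    return rows

training_rows = label_rows(["😀😢", "🤔"])
for row in training_rows:
    print(row["encoded"], row["label"])
```

Emojis missing from the lookup table are dropped here, which mirrors an inner join; a left join would keep them with an empty label so you can see what still needs classifying.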
Scott