How to do sentiment analysis with Python or R

student_compute · August 2018

Hello
I want to run sentiment analysis on some tweets using Python or R.
I did the preprocessing in RapidMiner.
But I do not know how to use R or Python for the sentiment analysis inside RapidMiner.
Does someone know how? Or is there an example?
Any help is appreciated.
Thanks in advance

kayman · August 2018

I like to use the VADER sentiment part of the NLTK toolkit. It works pretty well with social data (sentiment analysis will always remain a bit of a challenge) and gives a bit more than the usual positive / negative indications.

The attached sample uses this framework. The example chops the response by sentence and gives the 'vibe' per sentence. I typically use this method to ensure that mixed data also gets covered well. But of course you could also use it on the full data.

What I provided looked like this:

Review.Body	Review.ID	Review.Date	Review.Title	Review.Rating
Sound is great. Picture is bad	XYZ123	Wed Aug 01 10:08:34 CEST 2018	My opinion	3.0

What it returns is as follows:

Review.ID	sentence	compound	negative	possitive	neutral	Review.Date	Review.Title	Review.Rating
XYZ123	Sound is great.	0.6249	0.0	0.672	0.328	Wed Aug 01 10:08:34 CEST 2018	My opinion	3.0
XYZ123	Picture is bad	-0.5423	0.636	0.0	0.364	Wed Aug 01 10:08:34 CEST 2018	My opinion	3.0

The closer the compound value (range -1 to +1) is to -1 or +1, the more likely it is that the sentiment of the given sentence is correspondingly negative or positive.

Notes from the attached process, in order:

- Define the review as ID.
- Keep only the attributes we need. We focus only on the body, but we could also concatenate it with the title.
- Get all other fields.
- Replace line breaks, as Python doesn't like these too much.
- Normalization and preparation.


<parameter key="script" value="import pandas
from nltk.sentiment.vader import SentimentIntensityAnalyzer
from nltk import tokenize

# rm_main is called by RapidMiner with the input example set as a DataFrame
def rm_main(data):

    sent_all = []
    score_all = []
    id_all = []

    # Create the analyzer once instead of once per review
    sid = SentimentIntensityAnalyzer()

    for index, row in data.iterrows():
        review = row["Review.Body"]
        _id = row["Review.ID"]

        # Split the review into sentences and score each one separately
        lines_list = tokenize.sent_tokenize(review)
        for sentence in lines_list:
            ss = sid.polarity_scores(sentence)
            id_all.append(_id)
            sent_all.append(sentence)
            score_all.append(ss)

    # One output row per sentence; 'scores' holds the dict returned by VADER
    data_split = pandas.DataFrame()
    data_split['Review.ID'] = id_all
    data_split['sentence'] = sent_all
    data_split['scores'] = score_all

    return data_split"/>
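We use the NLTK / VADER framework to do the sentiment analysis. It can easily be replaced with other frameworks or custom code.

Post-processing: the individual scores are then parsed out of the scores attribute, for example for the negative score:

<parameter key="negative" value="parse(replaceAll([scores],"^.*?'neg': (-?\\d+\\.\\d+).*$","$1"))"/>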
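
For readers who would rather do this extraction in Python than with a replaceAll expression, here is a minimal sketch. It assumes the scores attribute arrives as the text form of the dict returned by VADER (keys neg, neu, pos, compound); the example string reuses the values shown above for "Sound is great." and the function names are made up for illustration.

```python
import ast
import re

# Example scores text, using the values reported above for "Sound is great."
# (assumption: the column holds the printed form of the VADER dict)
SCORES_TEXT = "{'neg': 0.0, 'neu': 0.328, 'pos': 0.672, 'compound': 0.6249}"


def extract_score(scores_text, key):
    """Pull one numeric score out of the text with a regular expression."""
    pattern = r"'" + re.escape(key) + r"': (-?\d+\.\d+)"
    match = re.search(pattern, scores_text)
    if match is None:
        raise ValueError("no score found for key: " + key)
    return float(match.group(1))


def extract_all_scores(scores_text):
    """Alternative: turn the whole text back into a dict in one step."""
    return ast.literal_eval(scores_text)


print(extract_score(SCORES_TEXT, "neg"))
print(extract_all_scores(SCORES_TEXT)["compound"])
```

The regular expression mirrors the pattern used in the process, while ast.literal_eval avoids writing one pattern per score.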

http://www.nltk.org/_modules/nltk/sentiment/vader.html
http://t-redactyl.io/blog/2017/04/using-vader-to-handle-sentiment-analysis-with-social-media-text.html

lionelderkrikor · August 2018

Hi @student_compute,

In addition to @kayman's solution, I propose a Python script using the "textblob" library.

From your text attribute, this script delivers a polarity between -1 and +1, where:

-1 (negative) <= polarity <= +1 (positive).

To execute this script, you have to set the name of your text attribute (with quotes) in the Set Macros operator:

The process:

<parameter key="script" value="import pandas
from textblob import TextBlob

# %{textAttribute} is a RapidMiner macro, filled in from the Set Macros operator
textAtt = %{textAttribute}

def sent(text):
    # Polarity of the text, from -1 (negative) to +1 (positive)
    testimonial = TextBlob(str(text))
    sentiment = testimonial.sentiment.polarity
    return sentiment

# rm_main is a mandatory function,
# the number of arguments has to be the number of input ports (can be none)
def rm_main(data):
    data['polarity'] = data[textAtt].apply(sent)
    return data"/>

Regards,

Lionel

student_compute · August 2018

Hi, thank you very much
How can I add a new column, based on the scores, that shows the word "positive" or "negative" next to each sentence?
Thanks a lot

lionelderkrikor · August 2018

Hi @student_compute,

I think you can use the Generate Attributes and Set Data operators and, if needed, the Reorder Attributes operator.

Regards,

Lionel

student_compute · August 2018

Hello
Thank you so much
I used them, but I do not know how to show the polarity of the sentences based on the scores.
Please look at the attached screenshot (۱.JPG).
Could this be the case?
Thanks

lionelderkrikor · August 2018

Hi @student_compute,

You can, indeed, create a new attribute "Pol" defined by (for example):

- if -1 <= polarity < -0.1, then Pol = "negative"

- if -0.1 <= polarity <= 0.1, then Pol = "neutral"

- if 0.1 < polarity <= 1, then Pol = "positive"

Note: You can, of course, choose and set thresholds other than -0.1 / 0.1.
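
The same rule can also be written directly in Python. This is only a sketch: the function name polarity_label and its threshold parameter are made up for illustration, and 0.1 is just the example cut-off from the rules above.

```python
def polarity_label(polarity, threshold=0.1):
    """Map a polarity score in [-1, 1] to 'negative', 'neutral' or 'positive'."""
    if polarity < -threshold:
        return "negative"
    if polarity > threshold:
        return "positive"
    # Everything from -threshold to +threshold (inclusive) counts as neutral
    return "neutral"


for score in (-0.5, 0.0, 0.1, 0.7):
    print(score, polarity_label(score))
```

Inside rm_main this could be applied to the existing column with data['Pol'] = data['polarity'].apply(polarity_label), as an alternative to building the attribute with Generate Attributes.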

Here is the associated process:

<parameter key="script" value="import pandas
from textblob import TextBlob

# %{textAttribute} is a RapidMiner macro, filled in from the Set Macros operator
textAtt = %{textAttribute}

def sent(text):
    # Polarity of the text, from -1 (negative) to +1 (positive)
    testimonial = TextBlob(str(text))
    sentiment = testimonial.sentiment.polarity
    return sentiment

# rm_main is a mandatory function,
# the number of arguments has to be the number of input ports (can be none)
def rm_main(data):
    data['polarity'] = data[textAtt].apply(sent)
    return data"/>

Regards,

Lionel

student_compute · August 2018

Thank you so much
How can I download RapidMiner version 9?
Thanks

lionelderkrikor · August 2018

Hi @student_compute,

Here is the link to download RapidMiner 9.0 Beta:

http://static.www.turtlecreekpls.com/rnd/html/rapidminer-9.0-preview.html

Regards,

Lionel

student_compute · August 2018

Hello
Thank you so much
Is there a perplexity parameter for LDA in the new version? Or other new features?

student_compute · August 2018

Hello @kayman
I've used your code, but it did not recognize the NLTK package.
How do I download this package and make it available to RapidMiner?
I use Anaconda. I installed the textblob package, but I cannot get this one to work.
Could you help me with how to install it?

Thank you

lionelderkrikor · August 2018

Hi @student_compute,

Yes, perplexity is one of the performance measures in the latest version of the LDA operator.

Regards,

Lionel

student_compute · August 2018

Hello @lionelderkrikor
Thank you

student_compute · August 2018

Hello

How do I install the nltk package and use it? The program gives an error that this package does not exist! Thanks.

And

I downloaded and ran RapidMiner 9, but I do not know how to find the perplexity measure for assessing LDA. Does anyone know?

Thanks

lionelderkrikor · August 2018

Hi @student_compute,

"How to install nltk package and use it?"

Open the Command Prompt window (type "cmd" in the search bar of Windows 10) and type the following command: pip install nltk
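
If the script still reports that the package does not exist after pip install nltk, one possible cause (an assumption, not something confirmed in this thread) is that pip installed it into a different Python installation than the one RapidMiner is set up to use, which can happen when Anaconda is installed alongside another Python. A minimal check, meant to be run with the same Python that RapidMiner uses, might look like this (the helper name is_installed is made up for illustration):

```python
import importlib.util
import sys


def is_installed(package_name):
    """Return True if the running interpreter can find the package."""
    return importlib.util.find_spec(package_name) is not None


# Show which Python is actually running this script
print("Interpreter:", sys.executable)
for name in ("nltk", "textblob"):
    print(name, "installed:", is_installed(name))
```

If nltk shows up as not installed, calling that interpreter with "-m pip install nltk" installs the package for exactly that Python.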

"But I do not know how to find Perplexity measure for assessing LDA"

Connect the per output port of the LDA operator to the res port.

Regards,

Lionel

student_compute · August 2018

Hello
Thank you

Excuse me, about perplexity in LDA: could you send a sample screenshot?
Thanks

lionelderkrikor · August 2018

Hi,

Here are the screenshots relating to LDA:

Regards,

Lionel

student_compute · August 2018

Hello

Thank you so much

Could you just tell me what the other values are used for?
I mean the avgs.

student_compute · August 2018

Hello. Sorry for raising this topic again. I tried a lot. I installed nltk, but there is an error when running the process which I could not solve myself. Can anyone help me? Also, could the avg values reported in the LDA output be explained to me? What are they used for? Thanks for all your help.

lionelderkrikor · August 2018

Hi @student_compute,

Can you share your process so that we can reproduce your bug?

Try adding the following line to the Python script, after the other nltk.download('xxxxxx') calls:

nltk.download('vader_lexicon')

and execute the process once.
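
To avoid downloading the lexicon again on every run, the download can be guarded with a check. This is a sketch, not part of the original process: the function name ensure_vader_lexicon is made up, and it assumes the lexicon is stored under the resource path sentiment/vader_lexicon.zip, which is where SentimentIntensityAnalyzer looks for it by default.

```python
def ensure_vader_lexicon():
    """Download the VADER lexicon only if NLTK cannot already find it."""
    # Imported here so this function can be defined even before NLTK is installed
    import nltk

    try:
        nltk.data.find("sentiment/vader_lexicon.zip")
    except LookupError:
        nltk.download("vader_lexicon")


# Intended use: call ensure_vader_lexicon() at the start of rm_main,
# before creating the SentimentIntensityAnalyzer.
```

The sentence splitting in the script (tokenize.sent_tokenize) relies on NLTK resources of its own, so a LookupError naming a different resource would need the matching nltk.download call.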

Regards,

Lionel
