Text mining of crowdfunding data, including numerical metadata

seba77 Member Posts: 2 Contributor I
edited January 2020 in Help

Hi,
I'm trying to analyze crowdfunding datasets for a study project.
Each row of the dataset contains, among other things, a description of the campaign and how much money was raised for it.
The goal is to analyze the total occurrence of certain terms in the dataset using text mining.
The basic text mining process is not the problem. As an example, I found out that the term "android" appears in 150 crowdfunding campaigns.
Now it would be interesting to find out how much money was raised by the campaigns that contain this word.
In theory, that means adding up the values from the raised-money cell of every campaign that contains the word "android".
The goal is to get a result like this, with an additional column that shows the total money raised:

word        attribute name   document occurrences   total money raised
android     android          150                    10.000
wordpress   wordpress        120                    8.000


Is this possible with text mining?
Thank you in advance!

Answers

  • Thomas_Ott RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 1,761 Unicorn

    Yup, use a Wordlist to Data operator to create a data table, then use a Generate Attributes operator to create a new attribute column named "total money raised" and create a function to generate your $$$.

    Or you can use an Aggregate operator instead.
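
For anyone who wants to see the underlying logic outside RapidMiner, here is a minimal sketch in plain Python. It is not the RapidMiner operator chain itself, only an illustration of the aggregation being asked for: for each term, count the campaigns whose description contains it and sum the money those campaigns raised. The sample rows, the `term_totals` helper, and the simple lowercase tokenization are assumptions made up for the example.

```python
import re
from collections import defaultdict

# Hypothetical sample rows: (campaign description, money raised).
campaigns = [
    ("An Android app for tracking workouts", 4000.0),
    ("WordPress plugin for Android developers", 3000.0),
    ("A WordPress theme marketplace", 5000.0),
]

def term_totals(rows, terms):
    """Per term: number of campaigns containing it and the sum of their raised money."""
    doc_occurrences = defaultdict(int)
    total_raised = defaultdict(float)
    for description, raised in rows:
        # Use a set so a term counts once per campaign (document occurrence),
        # no matter how often it appears in the description.
        tokens = set(re.findall(r"[a-z0-9]+", description.lower()))
        for term in terms:
            if term in tokens:
                doc_occurrences[term] += 1
                total_raised[term] += raised
    return {term: (doc_occurrences[term], total_raised[term]) for term in terms}

result = term_totals(campaigns, ["android", "wordpress"])
for term, (docs, money) in result.items():
    print(term, docs, money)
```

Note that a campaign mentioning both terms contributes its full amount to both totals, so the per-term sums can add up to more than the overall money raised.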
