Merging double attributes and their examples together
Hello,
I am pretty new to RapidMiner and thus can't find a solution to the problem I have.
My example set contains customer IDs, booking years and the booked hotel. Because some customers go on holiday twice or more within a year, duplicate IDs occur. Is it possible in RapidMiner to somehow merge the booked hotels belonging to one ID into a single example? To make it clearer, here is an example of what I mean:
Raw Data Year 20XX:
ID BookedHotel
12 Laplaza
13 Greengarden
12 Ocean
15 Laplaza
Now the customer with the ID 12 is going on holiday twice this year. One time to Laplaza Hotel and the other time to the Ocean Hotel. Now what I want to achieve should look like this:
ID BookedHotel
12 Laplaza; Ocean
13 Greengarden
15 Laplaza
So if a customer books twice or more in a year, the hotels are separated by a semicolon in the same "cell". I already tried to achieve this with Pivot or Generate Concatenation, but without success.
Thank you for your help and sorry for any mistakes (I'm German).
Best Answers
-
MartinLiebig (Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, University Professor; Posts: 3,404; RM Data Scientist)
Hi Ingo,
you can also use this concat for some fancy tricks. Since | is the "or" operator in regex, you can aggregate attribute names with concat(att) using | as the separator, extract the result into a macro, and use it in Select Attributes to select these attributes or in Filter Examples with a matches expression.
Kudos to @hhomburg for this trick.
~Martin
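The idea behind this trick can be sketched outside RapidMiner. The snippet below is a minimal Python illustration, not the RapidMiner process itself: it joins a list of attribute names with | to form a regular expression and keeps only the columns whose whole name matches. The column names `Year` and `BookedHotelCategory` are made up for the example, and whole-name matching is an assumption about how the selection behaves.

```python
import re

# Attribute names to keep (in RapidMiner these would come from a table
# and be concatenated with "|" by the Aggregate operator).
keep = ["ID", "BookedHotel"]

# Join the names with "|" so the result reads as "ID|BookedHotel".
# re.escape guards against names that contain regex special characters.
pattern = "|".join(re.escape(name) for name in keep)

# Hypothetical attribute names of the example set.
columns = ["ID", "Year", "BookedHotel", "BookedHotelCategory"]

# Keep only the columns whose full name matches one of the alternatives.
selected = [name for name in columns if re.fullmatch(pattern, name)]

print(pattern)
print(selected)
```

Because the match is anchored to the whole name, `BookedHotelCategory` is not selected even though it starts with `BookedHotel`.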
Example Process:
Create a table of attributes to keep (can be stored in the repository or taken from a DB)
<parameter key="keep" value="concatenation"/>
Ensure Execution Order
The magic happens here!
- Sr. Director Data Solutions, Altair RapidMiner -
Dortmund, Germany
Answers
I have to admit that I am very impressed. I did not know about the concat function in the "Aggregate" operator and had actually started to build a workflow of at least 10 operators to solve this when I saw your post. This works like a charm and is so much more elegant!
Here is a small example process showing how this works.
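For readers who want to see the logic of the aggregation step spelled out, here is a rough Python sketch of grouping by ID and concatenating the hotel names. It only mirrors the idea of Aggregate with a concat function and is not the RapidMiner process; the function name `concat_by_id` and the "; " separator are assumptions chosen to match the desired output in the question.

```python
def concat_by_id(rows, separator="; "):
    """Group (id, hotel) pairs by id and join the hotels of each id."""
    grouped = {}
    for customer_id, hotel in rows:
        # Collect all hotels per customer; dicts keep first-seen order.
        grouped.setdefault(customer_id, []).append(hotel)
    return [(cid, separator.join(hotels)) for cid, hotels in grouped.items()]


# Raw data from the question.
rows = [(12, "Laplaza"), (13, "Greengarden"), (12, "Ocean"), (15, "Laplaza")]

merged = concat_by_id(rows)
for customer_id, hotels in merged:
    print(customer_id, hotels)
# 12 Laplaza; Ocean
# 13 Greengarden
# 15 Laplaza
```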
Cheers,
Ingo
Nice one indeed :D
Thank you all for your help. The aggregation solution actually worked pretty well for my case :D