Merging duplicate attributes and their examples together

eldenoso Member Posts: 65 Contributor I
edited December 2019 in Help

Hello,

I am pretty new to RapidMiner and thus can't find a solution to the problem I have.
My example set contains customer IDs, booking years and the booked hotel. Because some customers go on holiday twice or more within a year, duplicate IDs occur. Is it possible with RapidMiner to merge the booked hotels belonging to the same ID into one example? To make it clearer, here is an example of what I mean:

Raw Data Year 20XX:

ID  BookedHotel
12  Laplaza
13  Greengarden
12  Ocean
15  Laplaza

The customer with ID 12 goes on holiday twice this year, once to the Laplaza Hotel and once to the Ocean Hotel. What I want to achieve should look like this:

ID  BookedHotel
12  Laplaza; Ocean
13  Greengarden
15  Laplaza

So if a customer books twice or more in a year, the hotels are separated by a semicolon in the same "cell". I already tried to achieve this with Pivot and Generate Concatenation, but without success.

Thank you for your help and sorry for any mistakes (I'm German).

Best Answers

  • MartinLiebig Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, University Professor Posts: 3,404 RM Data Scientist
    Solution Accepted

    Hi eldenoso,

    Aggregate does the job: use concat(hotel) as the aggregation and group by id. The default delimiter is |, but you can of course swap it for another one with a Replace operator.
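
For readers who want to see the logic spelled out, here is a minimal sketch in plain Python of what the group-and-concatenate step does. It is not RapidMiner code. The function name `concat_by_id` is made up for illustration, and the sample rows are the ones from the question. The second step mimics replacing the default | delimiter with a semicolon.

```python
from collections import defaultdict

# Sample rows from the question: (ID, BookedHotel)
rows = [(12, "Laplaza"), (13, "Greengarden"), (12, "Ocean"), (15, "Laplaza")]


def concat_by_id(rows, delimiter="|"):
    """Group rows by ID and join the hotels of each ID with the delimiter.

    Hypothetical helper; it only imitates concat(hotel) grouped by id.
    """
    grouped = defaultdict(list)
    for customer_id, hotel in rows:
        grouped[customer_id].append(hotel)
    # IDs keep the order in which they first appear in the input
    return {cid: delimiter.join(hotels) for cid, hotels in grouped.items()}


# Step 1: aggregate with the | delimiter
merged = concat_by_id(rows)

# Step 2: replace the delimiter, as a Replace operator would
merged_semicolon = {cid: hotels.replace("|", "; ") for cid, hotels in merged.items()}

for cid, hotels in merged_semicolon.items():
    print(cid, hotels)
```

With the sample rows, ID 12 ends up with both hotels in one value, while IDs 13 and 15 keep their single hotel.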

    ~Martin

    - Sr. Director Data Solutions, Altair RapidMiner -
    Dortmund, Germany
  • MartinLiebig Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, University Professor Posts: 3,404 RM Data Scientist
    Solution Accepted

    Hi Ingo,

    You can also use this concat for some fancy tricks. Since | is the OR operator in regex, you can extract concat(att) into a macro and use it in Select Attributes to select these attributes, or in Filter Examples with a matches expression.

    Kudos to @hhomburg for this trick.
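
The idea behind the trick can be sketched in plain Python (again, not RapidMiner code): names joined with | form a regular expression that matches any one of them. The attribute names below are invented for illustration, and the sketch assumes the expression must match the whole attribute name, which is what `re.fullmatch` does. The `re.escape` call is an extra precaution added here; the concatenated value in the original trick is used as-is.

```python
import re

# Hypothetical attribute names of a data set (illustrative only)
all_attributes = ["id", "year", "hotel", "price", "hotel_rating"]

# Hypothetical "table of attributes to keep"
keep_table = ["id", "hotel"]

# Joining the names with | yields an alternation: "id|hotel"
pattern = "|".join(re.escape(name) for name in keep_table)

# Keep only attributes whose whole name matches one alternative
selected = [name for name in all_attributes if re.fullmatch(pattern, name)]

print(pattern)
print(selected)
```

Because the match covers the whole name, `hotel_rating` is not selected even though it starts with `hotel`.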

    ~Martin

    Example Process:

    [Process XML not preserved. Surviving operator annotations, in order:]

    - Create a table of attributes to keep, can be stored in repo or taken from a DB
    - Ensure Execution Order
    - The magic happens here!

Answers

  • IngoRM Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, Community Manager, RMResearcher, Member, University Professor Posts: 1,751 RM Founder

    I have to admit, I am very impressed. I did not know about the concat function in the "Aggregate" operator and had actually started to build a workflow of at least 10 operators to solve this when I saw your post. This works like a charm and is so much more elegant!

    Here is a small example process showing how this works.

    Cheers,

    Ingo

    [Example process XML not preserved.]

  • IngoRM Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, Community Manager, RMResearcher, Member, University Professor Posts: 1,751 RM Founder

    Nice one indeed :)

  • eldenoso Member Posts: 65 Contributor I

    Thank you all for your help. The aggregation solution actually worked pretty well for my case :)
