Apply Join operator on each Attribute

SylvainMSylvainM MemberPosts:18Maven
edited July 2019 inHelp

Hello everyone,

As usual, I'm sorry if the question has already been posted before - although I carefully checked...

Here is my issue. I have anExample Set_1containing an attribute calledDate, and a variable number n of attributes calledPeople_1toPeople_n. I also have anotherExample Set_2containing a list of names and biographic datas.

In order to get for each date the biographic datas of each people involved that day, I would like to apply theJoin Operatorto each attributePeople_i(i=1 to n) using their names asKey attributes.

For now, as n is still small (7), I can do it by hand, parametring 7Join Operators for eachPeopleattribute, and joining my 7 resulting sets in theAppend Operator. But, I would like to get the same result automatically, in case of n becomes too big (parametring 40Join Operators and connecting all them to anAppendseems quite heavy and inelegant to me).

I've tried during past week to find an elegant solution to that problem, and I haven't been able to figure it. I've triedLoopsof all kind,Subprocesses, etc. It seems very simple though, but I've been unable to get my "hand" results automatically.

Here are some of my datas to help you.

Thanks a lot, as always :-)

Sylvain

Row Id Date People_1 People_2 People_3 People_4 People_5 People_6 People_7
12 12.0 Sat Dec 19 00:00:00 EST 1970 Loretta Lynn Owen Bradley ? ? ? ? ?
13 13.0 Sat Jun 07 00:00:00 EDT 1980 Sissy Spacek Loretta Lynn Owen Bradley ? ? ? ?
14 14.0 Sat Dec 04 00:00:00 EST 2010 Loretta Lynn Sheryl Crow Miranda Lambert ? ? ? ?
15 15.0 Sat Aug 09 00:00:00 EDT 1969 查理的骄傲 Dallas Frazier A.L. "Doodle" Chet Atkins Jack Clement Bob Ferguson Felton Jarvis

Row People Birth_date Birth_state Sex Ethnicity
2 Charles Kelley Fri Sep 11 00:00:00 EDT 1981 GA Male Caucasian
3 Hillary Scott Tue Apr 01 00:00:00 EST 1986 TN Female Caucasian
4 Reba McEntire Mon Mar 28 00:00:00 EST 1955 OK Female Caucasian
5 Wanda Jackson Wed Oct 20 00:00:00 EST 1937 OK Female Caucasian
6 Carrie Underwood Thu Mar 10 00:00:00 EST 1983 OK Female Caucasian
7 Toby Keith Sat Jul 08 00:00:00 EDT 1961 OK Male Caucasian
8 David Bellamy Sat Sep 16 00:00:00 EDT 1950 FL Male Caucasian
9 Howard Bellamy Sat Feb 02 00:00:00 EST 1946 FL Male Caucasian
10 Keith Urban Thu Oct 26 00:00:00 EDT 1967 Northland Male Caucasian
11 Miranda Lambert Thu Nov 10 00:00:00 EST 1983 TX Female Caucasian
12 Sam Hunt Sat Dec 08 00:00:00 EST 1984 GA Male Caucasian
13 Johnny Cash Fri Feb 26 00:00:00 EST 1932 AR Male Caucasian
14 June Carter Sun Jun 23 00:00:00 EDT 1929 VA Female Caucasian
15 Merle Haggard Tue Apr 06 00:00:00 EST 1937 CA Male Caucasian
16 Waylon Jennings Tue Jun 15 00:00:00 EDT 1937 TX Male Caucasian
17 Willie Nelson Sat Apr 29 00:00:00 EST 1933 TX Male Caucasian
18 Loretta Lynn Thu Apr 14 00:00:00 EST 1932 KY Female Caucasian
19 Sissy Spacek Sun Dec 25 00:00:00 EST 1949 TX Female Caucasian
20 Sheryl Crow Sun Feb 11 00:00:00 EST 1962 MO Female Caucasian
21 查理的骄傲 Sun Mar 18 00:00:00 EST 1934 MS Male African American
22 Rodney Clawon ? TX Male Caucasian
23 Nathan Chapman ? TN Male Caucasian
Tagged:

Best Answer

  • JEdwardJEdward RapidMiner Certified Analyst, RapidMiner Certified Expert, MemberPosts:578Unicorn
    Solution Accepted

    @SylvainM

    I'm sure I'm missing something important, but have you tried the De-Pivot operator before the join?







    <操作or activated="true" class="process" compatibility="8.2.000" expanded="true" name="Process">

    <操作or activated="true" class="operator_toolbox:create_exampleset" compatibility="1.0.000" expanded="true" height="68" name="Create ExampleSet" width="90" x="45" y="34">








    <操作or activated="true" class="de_pivot" compatibility="8.2.000" expanded="true" height="82" name="De-Pivot" width="90" x="179" y="34">





    <操作or activated="true" class="operator_toolbox:create_exampleset" compatibility="1.0.000" expanded="true" height="68" name="Create ExampleSet (2)" width="90" x="45" y="187">








    <操作or activated="true" class="concurrency:join" compatibility="8.2.000" expanded="true" height="82" name="Join" width="90" x="380" y="85">








    <连接from_op = " Join" from_port="join" to_port="result 1"/>






    lionelderkrikor SylvainM

Answers

  • lionelderkrikorlionelderkrikor Moderator, RapidMiner Certified Analyst, MemberPosts:1,195Unicorn

    Hi@SylvainM,

    基于2 examplesets的你的文章,你能post a resulting exampleset of what you want to obtain, to better understand.

    Regards,

    Lionel

  • SylvainMSylvainM MemberPosts:18Maven

    Hello Lionel,

    Thank you for your help :-)

    This is what I get "by hand" (please, consider that my bio datas aren't complete yet, some names are missing, which explains why I don't have a row for each person indicated by my previousExample sets)

    Sat Dec 19 00:00:00 EST 1970 KY Female Caucasian
    Sat Jun 07 00:00:00 EDT 1980 TX Female Caucasian
    Sat Jun 07 00:00:00 EDT 1980 KY Female Caucasian
    Sat Dec 04 00:00:00 EST 2010 KY Female Caucasian
    Sat Dec 04 00:00:00 EST 2010 MO Female Caucasian
    Sat Dec 04 00:00:00 EST 2010 TX Female Caucasian
    Sat Aug 09 00:00:00 EDT 1969 MS Male African American

    Best,

    Sylvain

  • lionelderkrikorlionelderkrikor Moderator, RapidMiner Certified Analyst, MemberPosts:1,195Unicorn

    Hi again@SylvainM

    Could you share your process ?

    Regards,

    Lionel

  • SylvainMSylvainM MemberPosts:18Maven

    Hi Lionel,

    So I don't know if I did it correctly (I'm very bad with computers), but this is the "joining" part of the process. It takes as inputs the intermediary results I gave in my previous post.

    Thanks so much!

    Sylvain







    <操作or activated="true" class="process" compatibility="8.2.000" expanded="true" name="Process">







    <操作or activated="true" class="subprocess" compatibility="8.2.000" expanded="true" height="103" name="Subprocess" width="90" x="246" y="34">

    <操作or activated="true" class="multiply" compatibility="8.2.000" expanded="true" height="208" name="Multiply (5)" width="90" x="45" y="34"/>
    <操作or activated="true" class="multiply" compatibility="8.2.000" expanded="true" height="208" name="Multiply (6)" width="90" x="45" y="340"/>
    <操作or activated="true" class="concurrency:join" compatibility="8.2.000" expanded="true" height="82" name="Join (14)" width="90" x="179" y="646">








    <操作or activated="true" class="concurrency:join" compatibility="8.2.000" expanded="true" height="82" name="Join (13)" width="90" x="179" y="544">








    <操作or activated="true" class="concurrency:join" compatibility="8.2.000" expanded="true" height="82" name="Join (12)" width="90" x="179" y="442">








    <操作or activated="true" class="concurrency:join" compatibility="8.2.000" expanded="true" height="82" name="Join (11)" width="90" x="179" y="340">








    <操作or activated="true" class="concurrency:join" compatibility="8.2.000" expanded="true" height="82" name="Join (10)" width="90" x="179" y="238">








    <操作or activated="true" class="concurrency:join" compatibility="8.2.000" expanded="true" height="82" name="Join (8)" width="90" x="179" y="34">




    <参数键= " People_1 " value = "人" / >



    <操作or activated="true" class="concurrency:join" compatibility="8.2.000" expanded="true" height="82" name="Join (9)" width="90" x="179" y="136">








    <操作or activated="true" class="append" compatibility="8.2.000" expanded="true" height="208" name="Append (2)" width="90" x="514" y="136">






    <连接from_op = from_port =“把(5)输出1" to_op="Join (8)" to_port="left"/>
    <连接from_op = from_port =“把(5)输出2" to_op="Join (9)" to_port="left"/>
    <连接from_op = from_port =“把(5)输出3" to_op="Join (10)" to_port="left"/>
    <连接from_op = from_port =“把(5)输出4" to_op="Join (11)" to_port="left"/>
    <连接from_op = from_port =“把(5)输出5" to_op="Join (12)" to_port="left"/>
    <连接from_op = from_port =“把(5)输出6" to_op="Join (13)" to_port="left"/>
    <连接from_op = from_port =“把(5)输出7" to_op="Join (14)" to_port="left"/>







    <连接from_op = " Join (14)" from_port="join" to_op="Append (2)" to_port="example set 7"/>
    <连接from_op = " Join (13)" from_port="join" to_op="Append (2)" to_port="example set 6"/>
    <连接from_op = " Join (12)" from_port="join" to_op="Append (2)" to_port="example set 5"/>
    <连接from_op = " Join (11)" from_port="join" to_op="Append (2)" to_port="example set 4"/>
    <连接from_op = " Join (10)" from_port="join" to_op="Append (2)" to_port="example set 3"/>
    <连接from_op = " Join (8)" from_port="join" to_op="Append (2)" to_port="example set 1"/>
    <连接from_op = " Join (9)" from_port="join" to_op="Append (2)" to_port="example set 2"/>








    <操作or activated="true" class="select_attributes" compatibility="8.2.000" expanded="true" height="82" name="Select Attributes (7)" width="90" x="380" y="34">




















  • SylvainMSylvainM MemberPosts:18Maven

    And I know : it's super heavy and ugly :smileylol:

  • SylvainMSylvainM MemberPosts:18Maven

    :smileysurprised: That's exactly that I was looking for! I had no idea that I could useDe-pivotlike that. Sorry for being such a beginner... :smileyembarrassed:

    Thanks so much, JEdward! I deeply appreciate your help!

    And the problem is solved, of course.

Sign InorRegisterto comment.