Apply Join operator on each Attribute
Hello everyone,
As usual, I'm sorry if the question has already been posted before - although I carefully checked...
Here is my issue. I have anExample Set_1containing an attribute calledDate, and a variable number n of attributes calledPeople_1toPeople_n. I also have anotherExample Set_2containing a list of names and biographic datas.
In order to get for each date the biographic datas of each people involved that day, I would like to apply theJoin Operatorto each attributePeople_i(i=1 to n) using their names asKey attributes.
For now, as n is still small (7), I can do it by hand, parametring 7Join Operators for eachPeopleattribute, and joining my 7 resulting sets in theAppend Operator. But, I would like to get the same result automatically, in case of n becomes too big (parametring 40Join Operators and connecting all them to anAppendseems quite heavy and inelegant to me).
I've tried during past week to find an elegant solution to that problem, and I haven't been able to figure it. I've triedLoopsof all kind,Subprocesses, etc. It seems very simple though, but I've been unable to get my "hand" results automatically.
Here are some of my datas to help you.
Thanks a lot, as always :-)
Sylvain
Row | Id | Date | People_1 | People_2 | People_3 | People_4 | People_5 | People_6 | People_7 |
12 | 12.0 | Sat Dec 19 00:00:00 EST 1970 | Loretta Lynn | Owen Bradley | ? | ? | ? | ? | ? |
13 | 13.0 | Sat Jun 07 00:00:00 EDT 1980 | Sissy Spacek | Loretta Lynn | Owen Bradley | ? | ? | ? | ? |
14 | 14.0 | Sat Dec 04 00:00:00 EST 2010 | Loretta Lynn | Sheryl Crow | Miranda Lambert | ? | ? | ? | ? |
15 | 15.0 | Sat Aug 09 00:00:00 EDT 1969 | 查理的骄傲 | Dallas Frazier | A.L. "Doodle" | Chet Atkins | Jack Clement | Bob Ferguson | Felton Jarvis |
Row | People | Birth_date | Birth_state | Sex | Ethnicity |
2 | Charles Kelley | Fri Sep 11 00:00:00 EDT 1981 | GA | Male | Caucasian |
3 | Hillary Scott | Tue Apr 01 00:00:00 EST 1986 | TN | Female | Caucasian |
4 | Reba McEntire | Mon Mar 28 00:00:00 EST 1955 | OK | Female | Caucasian |
5 | Wanda Jackson | Wed Oct 20 00:00:00 EST 1937 | OK | Female | Caucasian |
6 | Carrie Underwood | Thu Mar 10 00:00:00 EST 1983 | OK | Female | Caucasian |
7 | Toby Keith | Sat Jul 08 00:00:00 EDT 1961 | OK | Male | Caucasian |
8 | David Bellamy | Sat Sep 16 00:00:00 EDT 1950 | FL | Male | Caucasian |
9 | Howard Bellamy | Sat Feb 02 00:00:00 EST 1946 | FL | Male | Caucasian |
10 | Keith Urban | Thu Oct 26 00:00:00 EDT 1967 | Northland | Male | Caucasian |
11 | Miranda Lambert | Thu Nov 10 00:00:00 EST 1983 | TX | Female | Caucasian |
12 | Sam Hunt | Sat Dec 08 00:00:00 EST 1984 | GA | Male | Caucasian |
13 | Johnny Cash | Fri Feb 26 00:00:00 EST 1932 | AR | Male | Caucasian |
14 | June Carter | Sun Jun 23 00:00:00 EDT 1929 | VA | Female | Caucasian |
15 | Merle Haggard | Tue Apr 06 00:00:00 EST 1937 | CA | Male | Caucasian |
16 | Waylon Jennings | Tue Jun 15 00:00:00 EDT 1937 | TX | Male | Caucasian |
17 | Willie Nelson | Sat Apr 29 00:00:00 EST 1933 | TX | Male | Caucasian |
18 | Loretta Lynn | Thu Apr 14 00:00:00 EST 1932 | KY | Female | Caucasian |
19 | Sissy Spacek | Sun Dec 25 00:00:00 EST 1949 | TX | Female | Caucasian |
20 | Sheryl Crow | Sun Feb 11 00:00:00 EST 1962 | MO | Female | Caucasian |
21 | 查理的骄傲 | Sun Mar 18 00:00:00 EST 1934 | MS | Male | African American |
22 | Rodney Clawon | ? | TX | Male | Caucasian |
23 | Nathan Chapman | ? | TN | Male | Caucasian |
Best Answer
-
JEdward RapidMiner Certified Analyst, RapidMiner Certified Expert, MemberPosts:578Unicorn
I'm sure I'm missing something important, but have you tried the De-Pivot operator before the join?
<操作or activated="true" class="process" compatibility="8.2.000" expanded="true" name="Process">
<操作or activated="true" class="operator_toolbox:create_exampleset" compatibility="1.0.000" expanded="true" height="68" name="Create ExampleSet" width="90" x="45" y="34">
<操作or activated="true" class="de_pivot" compatibility="8.2.000" expanded="true" height="82" name="De-Pivot" width="90" x="179" y="34">
<操作or activated="true" class="operator_toolbox:create_exampleset" compatibility="1.0.000" expanded="true" height="68" name="Create ExampleSet (2)" width="90" x="45" y="187">
<操作or activated="true" class="concurrency:join" compatibility="8.2.000" expanded="true" height="82" name="Join" width="90" x="380" y="85">
<连接from_op = " Join" from_port="join" to_port="result 1"/>2
Answers
Hi@SylvainM,
基于2 examplesets的你的文章,你能post a resulting exampleset of what you want to obtain, to better understand.
Regards,
Lionel
Hello Lionel,
Thank you for your help :-)
This is what I get "by hand" (please, consider that my bio datas aren't complete yet, some names are missing, which explains why I don't have a row for each person indicated by my previousExample sets)
Best,
Sylvain
Hi again@SylvainM
Could you share your process ?
Regards,
Lionel
Hi Lionel,
So I don't know if I did it correctly (I'm very bad with computers), but this is the "joining" part of the process. It takes as inputs the intermediary results I gave in my previous post.
Thanks so much!
Sylvain
<操作or activated="true" class="process" compatibility="8.2.000" expanded="true" name="Process">
<操作or activated="true" class="subprocess" compatibility="8.2.000" expanded="true" height="103" name="Subprocess" width="90" x="246" y="34">
<操作or activated="true" class="multiply" compatibility="8.2.000" expanded="true" height="208" name="Multiply (5)" width="90" x="45" y="34"/>
<操作or activated="true" class="multiply" compatibility="8.2.000" expanded="true" height="208" name="Multiply (6)" width="90" x="45" y="340"/>
<操作or activated="true" class="concurrency:join" compatibility="8.2.000" expanded="true" height="82" name="Join (14)" width="90" x="179" y="646">
<操作or activated="true" class="concurrency:join" compatibility="8.2.000" expanded="true" height="82" name="Join (13)" width="90" x="179" y="544">
<操作or activated="true" class="concurrency:join" compatibility="8.2.000" expanded="true" height="82" name="Join (12)" width="90" x="179" y="442">
<操作or activated="true" class="concurrency:join" compatibility="8.2.000" expanded="true" height="82" name="Join (11)" width="90" x="179" y="340">
<操作or activated="true" class="concurrency:join" compatibility="8.2.000" expanded="true" height="82" name="Join (10)" width="90" x="179" y="238">
<操作or activated="true" class="concurrency:join" compatibility="8.2.000" expanded="true" height="82" name="Join (8)" width="90" x="179" y="34">
<参数键= " People_1 " value = "人" / >
<操作or activated="true" class="concurrency:join" compatibility="8.2.000" expanded="true" height="82" name="Join (9)" width="90" x="179" y="136">
<操作or activated="true" class="append" compatibility="8.2.000" expanded="true" height="208" name="Append (2)" width="90" x="514" y="136">
<连接from_op = from_port =“把(5)输出1" to_op="Join (8)" to_port="left"/>
<连接from_op = from_port =“把(5)输出2" to_op="Join (9)" to_port="left"/>
<连接from_op = from_port =“把(5)输出3" to_op="Join (10)" to_port="left"/>
<连接from_op = from_port =“把(5)输出4" to_op="Join (11)" to_port="left"/>
<连接from_op = from_port =“把(5)输出5" to_op="Join (12)" to_port="left"/>
<连接from_op = from_port =“把(5)输出6" to_op="Join (13)" to_port="left"/>
<连接from_op = from_port =“把(5)输出7" to_op="Join (14)" to_port="left"/>
<连接from_op = " Join (14)" from_port="join" to_op="Append (2)" to_port="example set 7"/>
<连接from_op = " Join (13)" from_port="join" to_op="Append (2)" to_port="example set 6"/>
<连接from_op = " Join (12)" from_port="join" to_op="Append (2)" to_port="example set 5"/>
<连接from_op = " Join (11)" from_port="join" to_op="Append (2)" to_port="example set 4"/>
<连接from_op = " Join (10)" from_port="join" to_op="Append (2)" to_port="example set 3"/>
<连接from_op = " Join (8)" from_port="join" to_op="Append (2)" to_port="example set 1"/>
<连接from_op = " Join (9)" from_port="join" to_op="Append (2)" to_port="example set 2"/>
<操作or activated="true" class="select_attributes" compatibility="8.2.000" expanded="true" height="82" name="Select Attributes (7)" width="90" x="380" y="34">
And I know : it's super heavy and ugly :smileylol:
:smileysurprised: That's exactly that I was looking for! I had no idea that I could useDe-pivotlike that. Sorry for being such a beginner... :smileyembarrassed:
Thanks so much, JEdward! I deeply appreciate your help!
And the problem is solved, of course.