How to "loop" over a filter?

eldenoso Member Posts: 65 Contributor I
edited November 2018 in Help

Hey everybody,

I have a dataset containing hourly data for every day of a whole year. What I want to do is filter out each day separately. Obviously, doing that manually would be very tedious, as I would have to do it 365 times. Is there a way to loop over the filter somehow?

Thanks:)

Best Answer

  • MartinLiebig Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, University Professor Posts: 3,302 RM Data Scientist
    Solution Accepted

    I think you need to get the data set into your Loop Collection using Remember and Recall. See the attached process.

    With one or two more operators we could also use the standard Loop operator together with a Select operator. The standard Loop has an additional input port and runs in parallel. There are quite a few options here :)

    Best,

    Martin

    <parameter key="12" value="TagdW.true.polynominal.attribute"/>

    <parameter key="Grid" value="if(Time>0&&Time<=15,15, if(Time>15&&Time<=30,30, if(Time>30&&Time<=45,45, if(Time>45&&Time<=60,60, if(Time>60&&Time<=75,75, if(Time>75&&Time<=90,90, if(Time>90&&Time<=105,105, if(Time>105&&Time<=120,120, if(Time>120&&Time<=135,135, if(Time>135&&Time<=150,150, if(Time>150&&Time<=165,165, if(Time>165&&Time<=180,180, if(Time>180&&Time<=195,195, if(Time>195&&Time<=210,210, if(Time>210&&Time<=225,225, if(Time>225&&Time<=240,240, if(Time>240&&Time<=255,255, if(Time>255&&Time<=270,270, if(Time>270&&Time<=285,285, if(Time>285&&Time<=300,300, if(Time>300&&Time<=315,315, if(Time>315&&Time<=330,330, if(Time>330&&Time<=345,345, if(Time>345&&Time<=360,360, if(Time>360&&Time<=375,375, if(Time>375&&Time<=390,390, if(Time>390&&Time<=405,405, if(Time>405&&Time<=420,420, if(Time>420&&Time<=435,435, if(Time>435&&Time<=450,450, if(Time>450&&Time<=465,465, if(Time>465&&Time<=480,480, if(Time>480&&Time<=495,495, if(Time>495&&Time<=510,510, if(Time>510&&Time<=525,525, if(Time>525&&Time<=540,540, if(Time>540&&Time<=555,555, if(Time>555&&Time<=570,570, if(Time>570&&Time<=585,585, if(Time>585&&Time<=600,600, if(Time>600&&Time<=615,615, if(Time>615&&Time<=630,630, if(Time>630&&Time<=645,645, if(Time>645&&Time<=660,660, if(Time>660&&Time<=675,675, if(Time>675&&Time<=690,690, if(Time>690&&Time<=705,705, if(Time>705&&Time<=720,720, if(Time>720&&Time<=735,735, if(Time>735&&Time<=750,750, if(Time>750&&Time<=765,765, if(Time>765&&Time<=780,780, if(Time>780&&Time<=795,795, if(Time>795&&Time<=810,810, if(Time>810&&Time<=825,825, if(Time>825&&Time<=840,840, if(Time>840&&Time<=855,855, if(Time>855&&Time<=870,870, if(Time>870&&Time<=885,885, if(Time>885&&Time<=900,900, if(Time>900&&Time<=915,915, if(Time>915&&Time<=930,930, if(Time>930&&Time<=945,945, if(Time>945&&Time<=960,960, if(Time>960&&Time<=975,975, if(Time>975&&Time<=990,990, if(Time>990&&Time<=1005,1005, if(Time>1005&&Time<=1020,1020, if(Time>1020&&Time<=1035,1035, if(Time>1035&&Time<=1050,1050, 
if(Time>1050&&Time<=1065,1065, if(Time>1065&&Time<=1080,1080, if(Time>1080&&Time<=1095,1095, if(Time>1095&&Time<=1110,1110, if(Time>1110&&Time<=1125,1125, if(Time>1125&&Time<=1140,1140, if(Time>1140&&Time<=1155,1155, if(Time>1155&&Time<=1170,1170, if(Time>1170&&Time<=1185,1185, if(Time>1185&&Time<=1200,1200, if(Time>1200&&Time<=1215,1215, if(Time>1215&&Time<=1230,1230, if(Time>1230&&Time<=1245,1245, if(Time>1245&&Time<=1260,1260, if(Time>1260&&Time<=1275,1275, if(Time>1275&&Time<=1290,1290, if(Time>1290&&Time<=1305,1305, if(Time>1305&&Time<=1320,1320, if(Time>1320&&Time<=1335,1335, if(Time>1335&&Time<=1350,1350, if(Time>1350&&Time<=1365,1365, if(Time>1365&&Time<=1380,1380, if(Time>1380&&Time<=1395,1395, if(Time>1395&&Time<=1410,1410, if(Time>1410&&Time<=1425,1425, if(Time>1425&&Time<=1440,1440,666)))) ))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))))"/>
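
The nested `if` chain in the parameter above appears to round `Time` (presumably minutes of the day) up to the next multiple of 15, with 666 as the fallback for values outside 0 < Time <= 1440. As a rough sketch of the same mapping outside RapidMiner, assuming that reading is correct, the chain collapses to a single ceiling operation. The function names `grid` and `grid_chain` are made up for this illustration:

```python
import math

def grid(time_minutes):
    """Round a minute-of-day value up to the next 15-minute mark.

    Mirrors the nested if chain: values outside (0, 1440] fall back to 666.
    """
    if 0 < time_minutes <= 1440:
        return math.ceil(time_minutes / 15) * 15
    return 666

def grid_chain(time_minutes):
    """Literal translation of the chain, kept only to cross-check grid()."""
    for upper in range(15, 1441, 15):
        if upper - 15 < time_minutes <= upper:
            return upper
    return 666

# The compact form agrees with the chain on every whole minute checked here.
matches = all(grid(t) == grid_chain(t) for t in range(-5, 1500))
```

In a Generate Attributes expression the same idea would need the ceiling function of the expression language, which is not shown in this thread, so the Python version only illustrates the arithmetic.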

    Just to ensure execution order

    <operator activated="false" class="operator_toolbox:group_into_collection" compatibility="0.1.000" expanded="true" height="82" name="Group Into Collection" width="90" x="112" y="238">

    - Head of Data Science Services at RapidMiner -
    Dortmund, Germany

Answers

  • MartinLiebig Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, University Professor Posts: 3,302 RM Data Scientist

    Hey,

    Loop Values would do the job. Our new Group Into Collection operator from the Operator Toolbox may be even better: it gives you a collection with one example set per day. You can then work through the days with Loop Collection.
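
As a rough illustration of the idea outside RapidMiner (not the operators' actual implementation), grouping splits the rows into one list per day, and looping over the collection then processes each day in turn. The sample rows and the helper name `group_into_collection` are made up for this sketch:

```python
from collections import defaultdict
from datetime import datetime

def group_into_collection(rows, key):
    """Split rows into one list per distinct key value, in first-seen order."""
    groups = defaultdict(list)
    for row in rows:
        groups[key(row)].append(row)
    return list(groups.values())

# Hypothetical hourly data: (timestamp, value)
rows = [
    (datetime(2018, 1, 1, 0), 10.0),
    (datetime(2018, 1, 1, 1), 12.0),
    (datetime(2018, 1, 2, 0), 7.0),
    (datetime(2018, 1, 2, 1), 9.0),
    (datetime(2018, 1, 2, 2), 11.0),
]

# One "example set" per calendar day
collection = group_into_collection(rows, key=lambda r: r[0].date())

# Counterpart of looping over the collection: handle each day separately
daily_means = [sum(value for _, value in day) / len(day) for day in collection]
```

The point of the design is that the data is split once, instead of filtering the full data set again for every single day.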

    ~Martin

  • eldenoso Member Posts: 65 Contributor I

    Thanks for your reply,

    that sounds pretty good. But could you be more specific? I downloaded the Operator Toolbox, but as soon as I put the Group Into Collection operator into the Loop Collection operator, the error message "Expected IOObjectCollection but received Examples set" occurs. Since I would call myself a newbie, I would be grateful if you could show me how to do this :-).

    Regards
    Philipp

  • eldenoso Member Posts: 65 Contributor I

    And besides that, is it possible to group by two attributes?

    _______________________________________________
    Okay, I solved this by putting another Group Into Collection operator into the Loop Collection. Now the problem is that I can't join a collection with another data set.

  • MartinLiebig Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, University Professor Posts: 3,302 RM Data Scientist

    Hi,

    I currently cannot run your process, but I think you need to use an Append before the Join to get a single example set again.
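
A minimal sketch of that order of operations, written in plain Python rather than RapidMiner: the collection is first flattened back into one table (the Append step) and only then joined with a second table. The sample data, the `weekday_names` lookup and the join key are assumptions made up for this illustration:

```python
from datetime import date
from itertools import chain

# Hypothetical collection: one list of rows per day
collection = [
    [{"date": date(2018, 1, 1), "value": 10.0},
     {"date": date(2018, 1, 1), "value": 12.0}],
    [{"date": date(2018, 1, 2), "value": 7.0}],
]

# Hypothetical second data set: weekday number -> weekday name
weekday_names = {0: "Monday", 1: "Tuesday", 2: "Wednesday", 3: "Thursday",
                 4: "Friday", 5: "Saturday", 6: "Sunday"}

# "Append": turn the collection back into a single table
appended = list(chain.from_iterable(collection))

# "Join": attach the weekday name to every row of the single table
joined = [dict(row, weekday=weekday_names[row["date"].weekday()])
          for row in appended]
```

Joining inside the loop, once per day, is the alternative discussed later in the thread; this sketch only shows why a join needs a plain table on each side rather than a collection.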

    Edit: Regarding the two attributes: that's on our list to add. The Toolbox extension is a community-like extension, even though it is a RapidMiner-internal community :). So far, you need Generate Attributes with concat to group by two attributes.

    Loop Values with Filter Examples is, by the way, also a viable option, but slightly slower in execution time.

    ~Martin

  • eldenoso Member Posts: 65 Contributor I

    Oh, okay. Maybe that's because I have so many data sets etc.

    But if I append now, I get the same result as before. What I want to do is join every element of the collection (in this case, e.g., 365) with another example set (which contains, e.g., the names of the days of the week). So appending wouldn't be an option, would it?

  • eldenoso Member Posts: 65 Contributor I

    I think we are near the finish line. Thank you for your process, it looks like it can work. But there is one error message occurring in the Recall operator, "no object with name data was found", even though we set the name to "data" in the Remember operator.

  • MartinLiebig Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, University Professor Posts: 3,302 RM Data Scientist

    Could you check if the Remember operator is executed before the Recall?

    See: http://community.www.turtlecreekpls.com/t5/RapidMiner-Studio-Knowledge-Base/Change-the-Execution-Order-of-Processes/ta-p/31780

    Best,

    Martin

  • eldenoso Member Posts: 65 Contributor I

    Thanks for your fast response.

    According to RapidMiner it is definitely executed before the recall.

  • MartinLiebig Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, University Professor Posts: 3,302 RM Data Scientist

    Got it. Could you disable the "remove from store" option in Recall? Otherwise the object is no longer available in iteration 2. Sorry about that.

    ~Martin

  • eldenoso Member Posts: 65 Contributor I

    If I set "remove from store" to false, it works :-). Is that plausible?

  • eldenoso Member Posts: 65 Contributor I

    Okay, I did that in parallel. Thank you very much for this long discussion and the helpful answers! The process works fine now :)

    Best regards

    Philipp

  • MartinLiebig Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, University Professor Posts: 3,302 RM Data Scientist

    Yes,

    usually the objects are deleted from the store once you recall them. This is to save memory. In your special case you do not want the object deleted, because every iteration needs it. If you deactivate this option, the object is deleted once your process finishes.
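
To make the behaviour concrete, here is a toy model of a remember/recall store in plain Python. It is only a sketch of the semantics described in this thread, not RapidMiner's implementation; the class `ObjectStore`, the function `run_loop` and the sample data are invented for the example:

```python
class ObjectStore:
    """Toy stand-in for the store behind Remember and Recall."""

    def __init__(self):
        self._objects = {}

    def remember(self, name, obj):
        self._objects[name] = obj

    def recall(self, name, remove_from_store=True):
        if name not in self._objects:
            raise KeyError(f"no object with name {name} was found")
        if remove_from_store:
            # The object leaves the store, so a second recall will fail.
            return self._objects.pop(name)
        return self._objects[name]

def run_loop(iterations, remove_from_store):
    """Recall "data" once per iteration; return how many iterations succeeded."""
    store = ObjectStore()
    store.remember("data", ["Monday", "Tuesday"])
    completed = 0
    try:
        for _ in range(iterations):
            store.recall("data", remove_from_store=remove_from_store)
            completed += 1
    except KeyError:
        pass
    return completed

with_removal = run_loop(3, remove_from_store=True)
without_removal = run_loop(3, remove_from_store=False)
```

With removal enabled only the first iteration finds the object, which matches the "no object with name data was found" error reported earlier in the thread.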

    ~Martin

  • eldenoso Member Posts: 65 Contributor I

    Hello again,

    I have a question concerning the meta data: if I want to apply Replace Missing Values (Series) to each IOObject, I can't pick the attributes in the dropdown of the operator. :(

  • MartinLiebig Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, University Professor Posts: 3,302 RM Data Scientist

    Hi,

    you can simply type in the attribute names by hand. It works anyway.

    I think we need to investigate our meta data propagation there. But maybe it's just fine to take the meta data from Last execution (under Process).

    ~Martin

  • eldenoso Member Posts: 65 Contributor I

    Thank you, that also worked! :-)

    Now that I wanted to group by two attributes, I created another collection from the collection. The input port of the Join operator then reports the wrong input type.

  • MartinLiebig Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, University Professor Posts: 3,302 RM Data Scientist

    Hi,

    do you want to group by two attributes? If so, first build an indicator attribute like concat(att1,att2) and then do a single grouping on it.
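
Sketched in plain Python, under the assumption that rows are simple dictionaries: a combined key is generated from the two attributes, and one grouping on that key replaces the nested collections. The separator `_` is an addition made for this example (it keeps keys such as "ab"+"c" and "a"+"bc" apart), and the helper names and sample rows are hypothetical:

```python
from collections import defaultdict

def add_group_key(rows, att1, att2, sep="_"):
    """Return copies of the rows with a combined key built from two attributes."""
    return [dict(row, group_key=f"{row[att1]}{sep}{row[att2]}") for row in rows]

def group_by(rows, attribute):
    """Group rows into lists keyed by the value of one attribute."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[attribute]].append(row)
    return dict(groups)

# Hypothetical data with two grouping attributes
rows = [
    {"day": "2018-01-01", "shift": "early", "value": 3},
    {"day": "2018-01-01", "shift": "late", "value": 5},
    {"day": "2018-01-02", "shift": "early", "value": 4},
    {"day": "2018-01-01", "shift": "early", "value": 6},
]

groups = group_by(add_group_key(rows, "day", "shift"), "group_key")
```

The result is a flat collection with one group per (day, shift) combination, so downstream steps still receive plain tables rather than a collection of collections.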

    ~Martin

  • eldenoso Member Posts: 65 Contributor I

    Okay, that worked. Thank you! I think it's routine that will hopefully let me find this kind of solution myself, too.

    The whole process is finished now and works fine. But would there be a way to create, or rather get back, the meta data? It took me some typing to manually enter all the attribute names in a couple of different operators.

  • MartinLiebig Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, University Professor Posts: 3,302 RM Data Scientist

    Hi,

    usually Process -> Synchronize Meta Data with Real Data should do the job.

    Propagating meta data from Recall operators in complex loops is kind of difficult.

    Best,

    Martin
