"Problem with Loop Operator"

Stefan_E Member Posts: 53 Guru
edited June 2019 in Help
I do the following:

<process expanded="true" height="521" width="681">

<connect from_op="Filter Examples (3)" from_port="example set output" to_op="Append" to_port="example set 2"/>

This creates two IO collections, one for the models and one for the example sets.
However, both models and both example sets, corresponding to the two iterations of the loop, look exactly identical.

If I unroll this process, explicitly laying out the two iterations as shown here, I get the expected results.

Hence, I must conclude that the Loop operator is broken? >:(

<connect from_op="Filter Examples (3)" from_port="example set output" to_op="Append (3)" to_port="example set 1"/>
<connect from_op="Select Attributes (3)" from_port="example set output" to_port="result 4"/>

Regards Stefan

Answers

  • cherokee Member Posts: 82 Guru
    Hi Stefan_E,

    As far as I remember, the Loop operator worked for me. We cannot check your process setup, though, because something important is missing: the data!

    Best regards,
    chero
  • haddock Member Posts: 849 Maven
    Hi there,

    Like Chero, I'm OK with the loop operator in most of its guises (http://rapid-i.com/rapidforum/index.php/topic,2251.msg9179.html#msg9179). As I remember it, finding that 'append' flattens collections helped. I ran your code with data generators and have to ask what you are trying to achieve with it.
  • Stefan_E Member Posts: 53 Guru
    Thanks for your answers. They don't help yet, though, as I still don't understand why the two processes should give different results. I'm also not sure what Haddock means by the remark on 'append'.

    What do I want to do? Something similar to a boosted decision stump learner:
    • I want to separate the data set in a tree-like fashion, but want to make sure that each attribute is used at most twice, that is, with a single upper and a single lower bound for separation.
    • In the process I accept misclassified good examples, but eventually want to find all bad examples.
    • Hence, after each iteration, I build a new data set consisting of all misclassified bad examples and all correctly classified good examples, then apply Decision Stump anew.
    So far, results look pretty good. Unfortunately, I had to unroll the loop into separate sub-processes, which I then instantiate many times with 'Execute Process'.
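
The iteration described in the list above can be sketched outside RapidMiner. This is only an illustration in plain Python, not the actual process: the names `predict`, `fit_stump` and `iterate_stumps` and the toy data are made up for the example, the stump simply minimises training errors on one attribute, and the "each attribute at most twice" constraint is not enforced.

```python
from typing import List, Tuple

Example = Tuple[Tuple[float, ...], str]  # (attribute values, label "good" or "bad")
Stump = Tuple[int, float, str]           # (attribute index, threshold, side predicted bad)

def predict(stump: Stump, x: Tuple[float, ...]) -> str:
    """Predict "bad" on one side of the threshold, "good" on the other."""
    a, thr, side = stump
    is_bad = x[a] <= thr if side == "le" else x[a] > thr
    return "bad" if is_bad else "good"

def fit_stump(examples: List[Example]) -> Stump:
    """Exhaustively pick the single-attribute split with the fewest training errors."""
    best = None
    for a in range(len(examples[0][0])):
        for thr in sorted({x[a] for x, _ in examples}):
            for side in ("le", "gt"):
                candidate = (a, thr, side)
                errors = sum(predict(candidate, x) != y for x, y in examples)
                if best is None or errors < best[0]:
                    best = (errors, candidate)
    return best[1]

def iterate_stumps(examples: List[Example], max_rounds: int):
    """Each round keeps only the examples predicted "good" (correctly classified
    good ones plus misclassified bad ones) and fits a new stump on them."""
    remaining = list(examples)
    stumps = []
    for _ in range(max_rounds):
        if not any(y == "bad" for _, y in remaining):
            break  # every bad example has been split off
        stump = fit_stump(remaining)
        kept = [(x, y) for x, y in remaining if predict(stump, x) == "good"]
        if len(kept) == len(remaining):
            break  # the stump flagged nothing, so further rounds would not change anything
        stumps.append(stump)
        remaining = kept
    return stumps, remaining

# Toy data (hypothetical): the bad examples sit at both ends of attribute 0.
data = [((1.0, 9.0), "bad"), ((2.0, 6.0), "good"),
        ((3.0, 7.0), "good"), ((4.0, 1.0), "bad")]
stumps, remaining = iterate_stumps(data, max_rounds=5)
print(stumps)     # [(0, 1.0, 'le'), (0, 3.0, 'gt')]
print(remaining)  # [((2.0, 6.0), 'good'), ((3.0, 7.0), 'good')]
```

On this toy data the two rounds end up with one lower and one upper bound on the same attribute, which is the kind of result the list above is after. The important point is that `remaining` is carried from one round to the next.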

    Stefan
  • cherokee Member Posts: 82 Guru
    Soooo,

    we have the typical case of documentation not matching code. The Loop operator does not deliver its output as new input for the next iteration. It just runs n times on the original input and collects the output.

    So the solution to your problem is not trivial. Try using the operators Remember and Recall within the loop operator and do not use any input directly.
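
The difference can be illustrated with a small Python analogy. It is not RapidMiner code: `loop_on_original`, `loop_with_memory`, `drop_smallest` and the dictionary used as a store are hypothetical stand-ins for the Loop operator as described above and for a Remember/Recall pair.

```python
def loop_on_original(data, step, n):
    """Behaviour described above: every iteration runs on the original input."""
    return [step(data) for _ in range(n)]

def loop_with_memory(data, step, n):
    """Remember/Recall analogy: every iteration starts from the stored result."""
    store = {"current": data}        # stands in for the object store
    outputs = []
    for _ in range(n):
        recalled = store["current"]  # Recall
        result = step(recalled)
        store["current"] = result    # Remember
        outputs.append(result)
    return outputs

def drop_smallest(values):
    """Toy step: remove the smallest value."""
    return sorted(values)[1:]

data = [5, 3, 8, 1]
print(loop_on_original(data, drop_smallest, 2))  # [[3, 5, 8], [3, 5, 8]]
print(loop_with_memory(data, drop_smallest, 2))  # [[3, 5, 8], [5, 8]]
```

The first call returns two identical results, which matches the identical models and example sets reported in the question. The second call only differs in that the result is stored and recalled between iterations.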

    Best regards,
    chero
  • haddock Member Posts: 849 Maven
    Greets,

    Chero is absolutely right. Having got the collections from each pass, you need to use the Append operator (which you'd expect nearby on the menu) in order to process them as one example set.
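
As a rough Python analogy (not RapidMiner code; the rows and the names `collection` and `appended` are invented for the illustration), a collection holds one example set per pass, and appending flattens it into a single example set:

```python
from itertools import chain

# One example set (list of rows) per loop pass, gathered in a collection.
collection = [
    [{"id": 1, "label": "bad"}],
    [{"id": 2, "label": "good"}, {"id": 3, "label": "bad"}],
]

# Appending: all rows of all passes end up in one flat example set.
appended = list(chain.from_iterable(collection))
print(len(collection), len(appended))  # 2 3
```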

  • Stefan_E Member Posts: 53 Guru
    cherokee wrote:

    we have the typical case of documentation not matching code. The Loop operator does not deliver its output as new input for the next iteration. It just runs n times on the original input and collects the output.
    Hi Chero,

    you hit the nail on the head... It works perfectly with Remember/Recall.

    I hope the RapidMiners read here and work on the documentation. It helps neither if it doesn't match the code nor if it's trivial, as it is most of the time ... (e.g. "Minimal size for split - the minimal size of a node in order to allow a split" ::) )

    Stefan