"Controlling loops (break/continue)"

colocolo Member Posts: 236 Maven
edited May 2019 in Help
Hi,

I have found that a process can quickly become hard to read once it contains nested loops or branches. In some cases I would have been happier if "Branch" were a simple operator rather than a super operator, one that delivers the input data to either a 'then' or an 'else' output port. You would lose the ability to combine or modify the input data for each port, but in most cases I didn't need that. Such an operator could be a simple, clear switch that routes the data according to the condition, and perhaps an alternative to the usual "Branch" operator for simple decisions. But this is just a thought that came to mind a few times during process design...

My actual question is a different one: I don't know how the different loops are translated into Java code internally, but I guess they use the language's standard loop constructs. Is there a way of controlling loops in RapidMiner with calls like those available in Java (continue/break)? Or does this conflict with the process structure of RapidMiner? Before trying to add some (hopefully simple) operators for these tasks, I wanted to make sure it is possible at all.
Maybe you also have some alternative suggestions. In the current case I am using "Loop Examples" on a list of URLs, retrieving each page via "Get Page" and then extracting some information from it. I already had to add one "Branch" after "Get Page" to keep the process from failing if a single page wasn't retrieved properly (due to connection problems or something else). Now there are some cases in which invalid XHTML code makes the subsequent XPath interpreter abort the process. In these cases the page doesn't contain useful information and the current loop iteration can stop at this point. Instead of using another super operator and putting the major part of the process inside it, I would prefer a simple single operator or something similar to skip/stop the iteration without any result. If I just eliminate the error sources (as I have done for now), this results in mostly empty examples that have to be filtered out later.
I hope my idea and question are understandable, but perhaps I am just thinking in the wrong direction and someone can point me towards a proper solution ;)

Best regards,
Matthias

P.S. The only related question I found was in an older topic (http://rapid-i.com/rapidforum/index.php/topic,892.0.html), which didn't provide a real solution.

Answers

  • dragoljub Member Posts: 241 Maven
    I am wondering about this too. I just came across a situation where I would love to have a loop continue on a specific exception. Does anyone know how to get continue to work within a loop?
  • land RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 2,531 Unicorn
    Hi,
    well, there's this nice operator "Handle Exception", which will catch any exception thrown by its inner operators and then let the loop continue to work. All you have to do is ensure that the outputs are always suitable for the following operators.

    Of course the Java control structures are used for implementing the loop. But breaking out of these loops isn't as simple as writing a script with "break". That won't work, because at that point you are in the method of a completely different class!
    Coincidentally, we discussed this problem here, and I think the better way to solve this whole issue is to change the behavior of the "Get Page" operator.
    What do you think?

    Greetings,
    Sebastian
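
For readers who think in Java terms, the two points made in the answer above can be sketched in plain Java. This is only an illustration, not RapidMiner's actual implementation: the class `LoopControlSketch`, the exception `SkipIterationException`, and the methods `extractInfo` and `processAll` are hypothetical names invented for this example, and the URLs are placeholders. The sketch shows that code running in a nested method cannot issue `continue` or `break` for a loop owned by its caller; it can only throw, and a try/catch inside the loop body (roughly the role "Handle Exception" plays in a process) then has the effect of `continue`.

```java
import java.util.ArrayList;
import java.util.List;

class LoopControlSketch {

    /** Hypothetical signal meaning "skip the current iteration". */
    static class SkipIterationException extends RuntimeException {
        SkipIterationException(String message) {
            super(message);
        }
    }

    /**
     * Hypothetical stand-in for the inner operators (fetch page, run XPath).
     * A 'continue' or 'break' written here would not compile, because the
     * loop lives in a different method. Throwing is the only way out.
     */
    static String extractInfo(String url) {
        if (url.contains("broken")) {
            throw new SkipIterationException("invalid XHTML at " + url);
        }
        return "info from " + url;
    }

    /** The loop owner catches the signal, which acts like 'continue'. */
    static List<String> processAll(List<String> urls) {
        List<String> results = new ArrayList<>();
        for (String url : urls) {
            try {
                results.add(extractInfo(url));
            } catch (SkipIterationException e) {
                // Nothing is added for this URL; the loop moves on to the next one.
            }
        }
        return results;
    }

    public static void main(String[] args) {
        List<String> urls = List.of(
                "http://example.org/a",
                "http://example.org/broken",
                "http://example.org/b");
        // Prints the results for the two pages that could be processed.
        System.out.println(processAll(urls));
    }
}
```

Because the failing iteration contributes nothing to the result list, no empty entries have to be filtered out afterwards. Whether an operator inside a RapidMiner loop can be made to behave this way depends on how the loop operator collects its results, which this sketch does not model.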