"Parallel processing on X-validation"

Wessel Member Posts: 537 Guru
edited June 2019 in Help
Parallel processing,

When using X-Validation with the RM parallel processing plug-in installed,
does RM evaluate classifier performance on each fold in parallel?

Best regards,



  • land RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 2,531 Unicorn
    Hi Wessel,
    not automatically. However, there is an X-Validation (parallel) operator that uses multiple threads to compute the folds in parallel. Keep in mind that it uses more memory, since each inner operator is executed in parallel and each instance creates its own data structures, for example the kernel cache of an SVM. In my experience, though, you will get a speedup that is nearly linear in the number of cores/threads used.

  • dragoljub Member Posts: 241 Maven
    I have also noticed a check box for parallelizing training/testing.

    When this is checked I experience synchronization problems. Did you actually implement a parallel SMO algorithm for training? For testing it makes sense, since the data can be split up and classified in parallel.

  • land RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 2,531 Unicorn
    This check box is generally available on all operators that provide a subprocess, but I would recommend using it with caution. With this check box the user has to decide whether the process is actually parallelizable; if it is not, these synchronization problems might occur. I would recommend using the dedicated parallel operators to be sure.
    We have not implemented a parallel SMO, but the X-Validation itself can be run in parallel for any learning algorithm:
    Since predictions are made only once, on a single fold, these write accesses don't interfere, and reading the data for learning is no problem either, so the folds can be processed in parallel. So you can speed up a ten-fold X-Validation by up to a factor of ten by using ten threads on a machine with at least ten cores. To go beyond this, it would not be enough to execute the single-threaded learning algorithm ten times; the learning algorithm itself would have to be made parallel, too.

  • dragoljub Member Posts: 241 Maven
    I have a 4-core system. I have explored performing 10-fold X-Validation using 10 threads, and this also gave me problems. I am back to my default of 4, but occasionally, when the 10-thread method works, it is still faster than 4 threads.

    Could this also be a sync problem?

    -Gagi ;D
  • land RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 2,531 Unicorn
    Sometimes using more threads than there are cores available works around some still-existing race conditions that prevent a thread from working all the time.

    What problems do you encounter? Could you be a little bit more precise so that I can check if there are still problems?
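
The fold-level parallelism described in this thread can be illustrated with a minimal Python sketch. This is not RapidMiner's implementation: every name here (`make_folds`, `train_centroids`, `run_fold`, `parallel_cross_validate`, `ideal_rounds`) is hypothetical, and a toy nearest-centroid learner on a single numeric feature stands in for the real learning algorithm. It shows only the structure: each fold trains its own model (hence the extra memory per worker) and writes predictions only for its own held-out rows, so the writes never overlap.

```python
import math
from concurrent.futures import ThreadPoolExecutor


def make_folds(n_rows, k):
    """Assign row indices to k disjoint folds, round-robin."""
    return [list(range(i, n_rows, k)) for i in range(k)]


def train_centroids(xs, ys):
    """Toy single-threaded learner (assumption): mean feature value per class."""
    sums, counts = {}, {}
    for x, y in zip(xs, ys):
        sums[y] = sums.get(y, 0.0) + x
        counts[y] = counts.get(y, 0) + 1
    return {y: sums[y] / counts[y] for y in sums}


def predict(model, x):
    """Return the class whose centroid is closest to x."""
    return min(model, key=lambda y: abs(model[y] - x))


def run_fold(data, labels, test_idx, predictions):
    """Train on all rows outside the fold, predict the rows inside it."""
    held_out = set(test_idx)
    train_x = [x for i, x in enumerate(data) if i not in held_out]
    train_y = [y for i, y in enumerate(labels) if i not in held_out]
    # Each fold builds its own model, so memory use grows with the worker count.
    model = train_centroids(train_x, train_y)
    for i in test_idx:
        # Every index belongs to exactly one fold, so no two workers write here.
        predictions[i] = predict(model, data[i])


def parallel_cross_validate(data, labels, k=10, workers=4):
    """Run the k folds on a thread pool and return the overall accuracy."""
    predictions = [None] * len(data)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [
            pool.submit(run_fold, data, labels, fold, predictions)
            for fold in make_folds(len(data), k)
        ]
        for future in futures:
            future.result()  # re-raise any exception from a worker
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)


def ideal_rounds(k, workers):
    """Rounds needed if every fold takes equally long and workers never stall."""
    return math.ceil(k / workers)
```

Two caveats on the sketch. In CPython, threads do not speed up CPU-bound pure-Python code because of the global interpreter lock, so this shows the data-access pattern rather than a real speedup; `ProcessPoolExecutor` is the usual choice for CPU-bound work, at the cost of returning predictions from each worker instead of writing into a shared list. Also, under the idealized assumption of equally long folds, `ideal_rounds(10, 4)` is 3, so four workers can give at most a 10/3 ≈ 3.3x speedup on ten folds, while `ideal_rounds(10, 10)` is 1, which matches the "up to a factor of ten" figure for ten threads on ten cores.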
