You are viewing the RapidMiner Radoop documentation for version 9.7.
Cloudera Hadoop CDH 5.x/6.x
Creating a Radoop connection
It is highly recommended to use the New Connection/Import from Cluster Manager option to create the connection directly from the configuration retrieved from Cloudera Manager. If you do not have a Cloudera Manager account that has access to the configuration, an administrator should be able to download the client configuration for you. Using the client configuration files, choose New Connection/Import Hadoop Configuration Files to create the connection from those files.
If security is enabled on the cluster, make sure you check the Configuring Apache Sentry authorization section of the Hadoop Security chapter.
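Before importing client configuration files, it can help to check what they contain. The following is a minimal sketch, not part of Radoop, that reads the standard Hadoop `*-site.xml` layout (a `configuration` element holding `property` entries with `name` and `value`). The function name `parse_hadoop_site_xml` and the sample host name are hypothetical and used only for illustration.

```python
import xml.etree.ElementTree as ET


def parse_hadoop_site_xml(xml_text):
    """Return a dict mapping property names to values from a Hadoop *-site.xml document."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.iter("property"):
        name = prop.findtext("name")
        value = prop.findtext("value")
        if name is not None:
            # A missing <value> element is stored as an empty string.
            props[name.strip()] = (value or "").strip()
    return props


# Made-up sample content; a real client configuration has many more entries.
SAMPLE = """<?xml version="1.0"?>
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://namenode.example.com:8020</value>
  </property>
  <property>
    <name>hadoop.security.authentication</name>
    <value>kerberos</value>
  </property>
</configuration>"""

props = parse_hadoop_site_xml(SAMPLE)
for key in sorted(props):
    print(key, "=", props[key])
```

In this sample, `hadoop.security.authentication` is set to `kerberos`, which indicates that security is enabled on the cluster.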
Configuring Spark
If you are using Spark 1.6, you may need to select Spark 1.6 (CDH) for more recent CDH 5.x Cloudera Hadoop releases and Spark 1.6 for older CDH 5.x releases. Select either of them and then run the Spark job test (enable only this test in Full Test.../Customize...), which automatically detects the proper version for you. Choose the setting that this test recommends.
Using any other Spark version should be straightforward.