You are viewing the RapidMiner Radoop documentation for version 9.7. Check here for the latest version.

Cloudera Hadoop CDH 5.x/6.x

Creating a Radoop connection

It is highly recommended to use the New Connection / Import from Cluster Manager option to create the connection directly from the configuration retrieved from Cloudera Manager. If you do not have a Cloudera Manager account with access to the configuration, an administrator should be able to Download Client Configuration for you. Using the client configuration files, choose New Connection / Import Hadoop Configuration Files to create the connection from those files.

If security is enabled on the cluster, make sure you check the Configuring Apache Sentry authorization section of the Hadoop Security chapter.
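Before importing client configuration files, it can help to confirm what they contain, for example whether the cluster reports Kerberos security. The following is a minimal sketch, not part of Radoop: it assumes the standard Hadoop `*-site.xml` layout (`<configuration>` with `<property>` entries holding `<name>` and `<value>`). The sample XML, the host name, and the helper `parse_hadoop_site_xml` are hypothetical and only illustrate the idea.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample standing in for a core-site.xml taken from the
# client configuration; the host name is made up for illustration.
SAMPLE_CORE_SITE = """<?xml version="1.0"?>
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://namenode.example.com:8020</value>
  </property>
  <property>
    <name>hadoop.security.authentication</name>
    <value>kerberos</value>
  </property>
</configuration>
"""


def parse_hadoop_site_xml(xml_text):
    """Return a dict mapping property names to values from a *-site.xml document."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.findall("property"):
        name = prop.findtext("name")
        if name is not None:
            # A missing or empty <value> element is stored as an empty string.
            props[name.strip()] = (prop.findtext("value") or "").strip()
    return props


props = parse_hadoop_site_xml(SAMPLE_CORE_SITE)

# Hadoop falls back to "simple" authentication when the property is not set.
security_enabled = props.get("hadoop.security.authentication", "simple") == "kerberos"

print("Default file system:", props["fs.defaultFS"])
print("Kerberos enabled:", security_enabled)
```

To inspect a real file, read its contents and pass the text to the same helper. If the check reports that Kerberos is enabled, the security-related settings of the connection need attention.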

Configuring Spark

If you are using Spark 1.6, you may need to select Spark 1.6 (CDH) for more recent CDH 5.x Cloudera Hadoop releases and Spark 1.6 for older CDH 5.x releases. Select either of them and then run the Spark job test (enable only this test in Full Test... / Customize...), which automatically detects the proper version for you. Please choose the setting that this test recommends.

Using any other Spark version should be straightforward.