You are viewing the RapidMiner Radoop documentation for version 9.0. Check here for the latest version.
RapidMiner Radoop Compatibility
Supported Hadoop distributions
RapidMiner Radoop works with most popular Hadoop distributions. Refer to the provider's documentation for information on configuring the Hadoop cluster. The supported distributions are:
- Amazon Elastic MapReduce (EMR) 4.4+
- Apache Hadoop 2.2+
- Azure HDInsight 3.6
- Cloudera Hadoop CDH 5.x
- Hortonworks HDP 2.x
- IBM Open Platform 4.1+
- MapR 5.x, 6.x
- Open Data Platform 0.9+
Supported data warehouse systems (DWS)
RapidMiner Radoop supports the following data warehouse infrastructures:
- Apache HiveServer2 0.13+
- Cloudera Impala 1.2.3 and later (see Impala limitations on the Installing Radoop on Studio page)
Supported Spark versions
RapidMiner Radoop supports the following Spark versions:
Apache Spark 1.2.x, 1.3.x and 1.4.x
- Supports Decision Tree, Linear Regression and Logistic Regression operators.
Apache Spark 1.5.x, 1.6.x, 2.0.x (except 2.0.1), 2.1.x, 2.2.x
- Supports all Spark operators, including Spark Script (Python and/or R is required on the cluster nodes), Single Process Pushdown and SparkRM.
- Spark 2.0.1 minor version is not supported.
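The Spark version rules above can be expressed as a small lookup, which may be handy when checking a cluster before connecting. The following is only an illustrative sketch: the function names `parse_version` and `spark_support_level` and the labels `limited`, `full` and `unsupported` are assumptions made for this example and are not part of RapidMiner Radoop.

```python
import re

# Operators available on the older Spark versions, as listed above.
LIMITED_OPERATORS = ("Decision Tree", "Linear Regression", "Logistic Regression")

# (major, minor) pairs taken from the two version groups listed above.
LIMITED_VERSIONS = {(1, 2), (1, 3), (1, 4)}
FULL_VERSIONS = {(1, 5), (1, 6), (2, 0), (2, 1), (2, 2)}


def parse_version(version):
    """Split 'major.minor[.patch]' into ints; patch is None when absent."""
    match = re.fullmatch(r"(\d+)\.(\d+)(?:\.(\d+))?", version.strip())
    if match is None:
        raise ValueError(f"unrecognized Spark version: {version!r}")
    major, minor, patch = match.groups()
    return int(major), int(minor), (int(patch) if patch is not None else None)


def spark_support_level(version):
    """Return 'limited', 'full' or 'unsupported' (hypothetical labels)."""
    major, minor, patch = parse_version(version)
    # 2.0.1 is the one explicitly excluded release within the 2.0.x line.
    if (major, minor, patch) == (2, 0, 1):
        return "unsupported"
    if (major, minor) in LIMITED_VERSIONS:
        return "limited"
    if (major, minor) in FULL_VERSIONS:
        return "full"
    return "unsupported"
```

For example, `spark_support_level("2.0.2")` falls into the full-support group, while `spark_support_level("2.0.1")` is rejected. Versions not named in the list are treated as unsupported here, which is a conservative assumption of this sketch rather than a statement from the documentation.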
Supported Java versions
RapidMiner Radoop requires Java 8 installed on the Hadoop cluster to operate. The nodes should have at least 8 GB of RAM.
RapidMiner extension compatibility
RapidMiner Radoop is not compatible with the Parallel Processing Extension. This extension must be disabled when using Radoop. Please select the Extensions > Manage Extensions... menu item and uncheck the box for Parallel Processing Extension.