You are viewing the RapidMiner Radoop documentation for version 9.7.

Radoop Proxy

Radoop Proxy lets you tunnel all Radoop connections through a single machine residing on the edge of your secure Hadoop cluster. Its purpose is to significantly reduce the number of ports that need to be opened on the firewall protecting the Hadoop cluster, making the networking configuration much easier.

Figure: Radoop Proxy architecture

Radoop Proxy is shipped in two ways:

  • bundled with RapidMiner Server
  • as a separate Docker container (which can be used as a standalone instance or as part of a RapidMiner Server deployment)

It accepts connections from RapidMiner Studio and forwards them to Hadoop, forming a single access point to the cluster. It is typically installed on one of the secured cluster machines (either an existing Hadoop node or a dedicated edge node), so it resides on the same local network as the cluster nodes. To allow outside access to the secured cluster, only a handful of ports need to be opened on the firewall, which makes the default networking setup unnecessary.
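Because the proxy is the single access point, a quick way to confirm that the firewall lets a client through is to try a plain TCP connection to the proxy port. The following is a minimal sketch, not part of RapidMiner: the function name `proxy_port_reachable` is hypothetical, and the default port 1081 is the default for a Radoop Proxy connection. It only tests that the port accepts connections; it does not authenticate or speak the proxy protocol.

```python
import socket


def proxy_port_reachable(host: str, port: int = 1081, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be established.

    Hypothetical helper for illustration only. A True result means the
    port is open from this machine, nothing more.
    """
    try:
        # create_connection resolves the host name and opens the socket;
        # the context manager closes it again immediately.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution errors.
        return False
```

For example, `proxy_port_reachable("radoop-proxy.example.com")` (a placeholder host name) would return `False` if the firewall still blocks port 1081 from the machine running the check.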

To ensure security, RapidMiner Studio users must be authenticated when using the Radoop Proxy. Authentication is done by a RapidMiner Server that is connected to the Radoop Proxy, or with a standalone username/password combination (depending on Radoop Proxy configuration).

The connection between RapidMiner Studio users and Radoop Proxy can be secured with SSL if the certificates for the machine hosting the proxy are available. Since both RapidMiner Server users and Hadoop users can be managed in a central LDAP server, this also enables centralized and convenient user management.

Setting up a Radoop Proxy Connection

  1. Create a new Radoop Proxy Connection from RapidMiner Studio or from RapidMiner Server. The fields on the Setup tab are explained below:

    Field | Description
    Radoop Proxy Server host | IP address or hostname of the Radoop Proxy server.
    Radoop Proxy Server port | Port of the Radoop Proxy server. The default is 1081.
    Use Enterprise SSO | Use the Enterprise SSO token from the parent Repository. Only displayed for Radoop Proxy Connections that are in the Repository panel. Should only be used with a Radoop Proxy Server configured with SAML authentication; see Custom Radoop Proxy Installation for details. RapidMiner Server user and RapidMiner Server password are disabled when this feature is used.
    RapidMiner Server user | Username for authenticating. This is disabled when Use Enterprise SSO is checked.
    RapidMiner Server password | Password to use for the connection. This is disabled when Use Enterprise SSO is checked.
    Use secure (SSL) connection | Use SSL for the proxy connection. See Radoop Proxy security for details. If your certificate is not well known, you need to fill in Keystore file and Keystore password.
    Keystore file | (SSL) Keystore file that contains the SSL certificate to use. If the certificate is well known, you can leave this empty. This is disabled when Use secure (SSL) connection is unchecked.
    Keystore password | (SSL) Password used to unlock the keystore. If the keystore has no password, you can leave this empty. This is disabled when Use secure (SSL) connection is unchecked.
  2. In the Connections menu, select Manage Radoop Connections, edit your connection by clicking Configure, and on the Radoop Proxy tab check Use Radoop Proxy.

  3. Select the location of your proxy definition from the first dropdown selector in the Radoop Proxy Connection section. Choose Local Repository for a connection set up in Studio, or the name of the Server repository for a remote connection.

  4. Select the connection from the second dropdown selector in the Radoop Proxy Connection section. For Server locations, you may need to click the Refresh button to sync the connections from the Server to Studio. You can also click the Edit button to edit an existing Radoop Proxy connection or create a new one directly from this screen.
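The dependencies between the Setup tab fields (SSO disables the user and password fields, and the keystore fields only apply when SSL is enabled) can be summarized in a small sketch. This is purely illustrative and is not a RapidMiner API: the class `RadoopProxySettings`, its attribute names, and the function `check_settings` are hypothetical. The rule that a username is required when Enterprise SSO is off is an assumption derived from the requirement that users must be authenticated.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class RadoopProxySettings:
    """Hypothetical model of the Setup tab fields (not a RapidMiner class)."""
    host: str
    port: int = 1081                      # documented default port
    use_enterprise_sso: bool = False
    user: Optional[str] = None
    password: Optional[str] = None
    use_ssl: bool = False
    keystore_file: Optional[str] = None   # only needed if the cert is not well known
    keystore_password: Optional[str] = None


def check_settings(s: RadoopProxySettings) -> List[str]:
    """Return a list of problems; an empty list means the combination is consistent."""
    problems: List[str] = []
    if not s.host.strip():
        problems.append("host is required")
    if not 1 <= s.port <= 65535:
        problems.append("port must be between 1 and 65535")
    if s.use_enterprise_sso:
        # User and password are disabled when Enterprise SSO is used.
        if s.user or s.password:
            problems.append("user/password are disabled when Enterprise SSO is used")
    elif not s.user:
        # Assumption: without SSO, authentication needs a username.
        problems.append("user is required without Enterprise SSO")
    if not s.use_ssl and (s.keystore_file or s.keystore_password):
        # Keystore fields are disabled when SSL is unchecked.
        problems.append("keystore fields are disabled when SSL is unchecked")
    return problems
```

With SSL enabled and no keystore given, `check_settings` reports no problem, which matches the case of a well-known certificate where both keystore fields can stay empty.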