You are viewing the RapidMiner Radoop documentation for version 9.5.

Connection Errors

Your Connection Test or Full Connection Test may fail for various reasons. The most frequent issues are listed here. If you can't fix the problem using one of the procedures below, please see the RapidMiner Support portal.

Hive Connection Test Timeout

You may experience this issue when the Hive connection test step returns with a timeout. The test uses a timeout for each small test operation. The Hadoop connection itself is also guarded by a separate timeout value. Both timeout values default to 30 seconds. If the Hadoop cluster is busy, or the network latency is high or varies widely, increasing the two timeout values should solve the problem. You can also find a detailed explanation of all the connection parameters in the Advanced Settings section of the Configuring Radoop Connections page.

You can increase these timeout values in RapidMiner Studio as follows:

  1. From the Connections menu, open the Manage Radoop Connections... menu item.

  2. Find your Radoop connection, select it and click Configure. The connection settings will open.

  3. On the Hadoop tab, find and increase the value of the Connection timeout parameter.

    Tip: Use the **Filter** textbox at the top of the window to locate the parameters more easily.

  4. On the Hive tab, find and increase the value of the Hive command timeout parameter. Click OK.

  5. Rerun your connection test of choice to see if the timeout increase has solved the problem. Remember to Save your connection before closing the dialog so that the settings are persisted.

You can increase these timeout values in RapidMiner Server as follows:

  1. Locate and edit your radoop_connections.xml with the editor of your choice. The recommended location for this file is the .RapidMiner directory within your Server home folder.

  2. Find the XML tags named connection_timeout_seconds and hive_command_timeout_seconds nested in the Radoop connection entry you want to adjust, such as:

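      ... your-connection ...
      <connection_timeout_seconds>30</connection_timeout_seconds>
      <hive_command_timeout_seconds>30</hive_command_timeout_seconds>
      ...
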
  3. Save the file, then restart RapidMiner Server for the settings to take effect, and for the Sync service to distribute the updated file to all Job Agents in your deployment.
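
For deployments where hand-editing is inconvenient, the edit in step 2 could be scripted. The following is a minimal sketch, assuming that radoop_connections.xml is well-formed XML and that the two timeout tags are elements whose text is an integer number of seconds. The helper name `raise_timeouts` is hypothetical, and the sketch raises the values for every connection entry in the document rather than for one named connection, because the surrounding tag names are not shown here.

```python
import xml.etree.ElementTree as ET

# Tag names as given in step 2 above.
TIMEOUT_TAGS = ("connection_timeout_seconds", "hive_command_timeout_seconds")


def raise_timeouts(xml_text: str, seconds: int) -> str:
    """Return xml_text with every timeout element raised to at least `seconds`.

    Hypothetical helper: it touches all matching elements in the document,
    not only those of a single connection entry.
    """
    root = ET.fromstring(xml_text)
    for tag in TIMEOUT_TAGS:
        for element in root.iter(tag):
            text = (element.text or "").strip()
            current = int(text) if text else 0
            # Never lower a value that is already higher than requested.
            if current < seconds:
                element.text = str(seconds)
    return ET.tostring(root, encoding="unicode")
```

To use it, read the file contents into a string, call `raise_timeouts(contents, 120)`, and write the result back. Keep a backup of the original file, since ElementTree does not preserve XML comments or the XML declaration, and restart RapidMiner Server afterwards as described in step 3.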

Note that when you start the Hive service for the first time, it usually takes up to 15 seconds to initialize the metastore. If you get a timeout error when doing a fresh Hive install, wait a few seconds and try again.

To troubleshoot a timeout error, check the log of the Hive server instance on the cluster node for the timed-out command that the client sent. The log indicates whether the command reached the server and whether the server sent a response back in time.

Timeout on Hive Import Test

If the Hive Import job times out in the Full Connection Test, Radoop most likely can't communicate with the Job History Server. You can verify this by opening the Resource Manager web interface. If you see a succeeded MapReduce job under the Applications menu with the name "Radoop Import CSV job", check whether your Job History Server is running and is accessible on the corresponding port (10020 by default). To fix the issue, set the Job History Server Address (available if the Multiple Masters checkbox is enabled) and the Job History Server Port on the Connection Settings dialog.
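
A quick way to check whether the Job History Server port is accessible from the client machine is a plain TCP connection attempt. The following is a minimal sketch using only the Python standard library; the function name `is_port_reachable` and the host name in the usage note are hypothetical. A successful connection only shows that something is listening on the port, not that the Job History Server itself is healthy.

```python
import socket


def is_port_reachable(host: str, port: int, timeout_seconds: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        # create_connection resolves the host name and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout_seconds):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False
```

For example, `is_port_reachable("jobhistory.example.com", 10020)` would test the default port on a placeholder host; substitute the address configured in your connection.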

Permission Issues

All users of the RapidMiner Radoop client must have permissions to either create a /tmp directory on HDFS or to create subdirectories under it. If you get an error message regarding permission issues, consult with your Hadoop administrator.

RapidMiner Radoop may also return a permissions error for the .staging directory of the current user. The error message reports the directory path and the missing permission. Again, consult your Hadoop administrator for help. Note that, for security reasons, the HDFS .staging directory of a user must grant access rights only to that specific user.
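
When discussing such an error with your administrator, it can help to read the permission string that an HDFS directory listing shows in its first column (for example `drwx------`). The following is a minimal sketch that interprets such a string, assuming the usual 10-character POSIX-style layout. Both helper names are hypothetical, and the checks are deliberately simple: a world-writable /tmp is only one way users can be allowed to create subdirectories, since access may also come from ownership or group membership.

```python
def _check_mode(permissions: str) -> str:
    """Validate a mode string such as 'drwxrwxrwt' and return its first 10 characters."""
    if len(permissions) < 10:
        raise ValueError(f"not a 10-character mode string: {permissions!r}")
    # A trailing '+' (ACL marker) or other suffix is ignored.
    return permissions[:10]


def only_owner_has_access(permissions: str) -> bool:
    """True if group and others have no rights, as expected for a .staging directory."""
    mode = _check_mode(permissions)
    return mode[4:10] == "------"


def world_writable(permissions: str) -> bool:
    """True if 'others' may write, one sufficient condition for a shared /tmp."""
    mode = _check_mode(permissions)
    return mode[8] == "w"
```

For instance, `only_owner_has_access("drwx------")` is true, while `only_owner_has_access("drwxr-xr-x")` is false because group and others can read and list the directory.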