Radoop and Hortonworks Sandbox Connection Problem

personable, Member, Posts: 1, Contributor I
edited November 2018 in Help

Dear friends

I have a problem while connecting from RapidMiner 7.3 Radoop to the Hortonworks sandbox.

I have installed the following Hortonworks sandbox image on VMware Workstation: HDP_2.5_docker_vmware_25_10_2016_08_59_25_hdp_2_5_0_0_1245_ambari_2_4_0_0_1225.ovf

and have also applied the Distribution-Specific Notes from the Radoop documentation to it:

http://docs.www.turtlecreekpls.com/radoop/installation/distribution-notes.html#hdp-sandbox

But when I create a connection from Radoop to the sandbox and run the Quick Test, I get the following error (screenshots are attached):

[Dec 21, 2016 6:25:57 PM] SEVERE: com.rapidminer.operator.UserError: Could not upload the necessary component to the directory on the HDFS: '/tmp/radoop/_shared/db_default/'
[Dec 21, 2016 6:25:57 PM] SEVERE: Hive jar (with additional functions) upload failed. Please check that the NameNode and DataNodes run and are accessible on the address and port you specified.
[Dec 21, 2016 6:25:57 PM] SEVERE: Test failed: UDF jar upload
[Dec 21, 2016 6:25:57 PM] SEVERE: Connection test for 'Hortonworks_Hadoop' failed.
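
The two SEVERE lines above come down to the HDFS client not being able to write the Radoop UDF jar into /tmp/radoop/_shared/db_default/. A quick way to check whether the NameNode and DataNodes are reachable at all, independently of RapidMiner, is a minimal write with the plain HDFS Java client. This is only a sketch; the host name, NameNode port, and user below are assumptions based on the sandbox defaults, so adjust them to your environment.

import java.net.URI;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

// Minimal HDFS write check that mimics Radoop's "UDF jar upload" test:
// it writes a small file under /tmp/radoop/_shared/db_default/ and deletes it again.
// Host name, NameNode port (8020) and user ("hdfs") are assumptions (HDP sandbox defaults).
public class HdfsWriteCheck {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(
                URI.create("hdfs://sandbox.hortonworks.com:8020"), conf, "hdfs");

        Path target = new Path("/tmp/radoop/_shared/db_default/write_check.txt");
        try (FSDataOutputStream out = fs.create(target, true)) {
            // Fails with a similar replication error if the DataNodes are unreachable.
            out.writeUTF("radoop write check");
        }
        System.out.println("Write succeeded: " + fs.getFileStatus(target).getLen() + " bytes");
        fs.delete(target, false);
    }
}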

Regards

[Screenshots attached: e1.jpg, e2.jpg, e3.jpg. When the Quick Test is pressed, a file is created in db_default.]

Best Answer

  • phellinger, Employee, Member, Posts: 103, RM Engineering
    Solution Accepted

    Hi All,

    We have updated the guide for connecting to the latest Hortonworks Sandbox virtual machine. Following the steps thoroughly should solve the issues above.

    Please follow the guide at http://docs.www.turtlecreekpls.com/radoop/installation/distribution-notes.html.

    For those interested in the technical details, here is a short explanation. The Hortonworks Sandbox connection problems appeared because Hortonworks updated their Sandbox environment: Hadoop now runs in Docker inside the virtual machine. After this change in the networking, a hostname must be used to access the DataNodes, because a hostname can be resolved to either the external or the internal IP address depending on where it is resolved. Moreover, not all ports are exposed properly, which is why the permanent iptables rules need to be added as a workaround (a simple port-reachability check is sketched below).

    Best,

    Peter
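
    As a rough way to verify the "not all ports are exposed" part, one can simply try to open a TCP connection to the DataNode transfer port from the client machine. This is only a sketch; the host name and port 50010 (the usual Hadoop 2.x default for dfs.datanode.address) are assumptions, so check the actual values in your cluster configuration.

    import java.io.IOException;
    import java.net.InetSocketAddress;
    import java.net.Socket;

    // Quick reachability check for the DataNode transfer port from the client machine.
    // Host and port are assumptions (HDP sandbox defaults); verify dfs.datanode.address
    // in your cluster configuration.
    public class DataNodePortCheck {
        public static void main(String[] args) {
            String host = "sandbox.hortonworks.com";
            int port = 50010; // default dfs.datanode.address port in Hadoop 2.x
            try (Socket socket = new Socket()) {
                socket.connect(new InetSocketAddress(host, port), 5000); // 5 second timeout
                System.out.println(host + ":" + port + " is reachable");
            } catch (IOException e) {
                System.out.println(host + ":" + port + " is NOT reachable: " + e.getMessage());
            }
        }
    }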


Answers

  • zprekopcsak, RapidMiner Certified Expert, Member, Posts: 47, Guru

    Hi,

    Please try the following Advanced Hadoop Parameter (a short client-side sketch of its effect follows below):

    Key = dfs.client.use.datanode.hostname
    Value = true

    Best, Zoltan
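
    To illustrate what this parameter changes on the client side (a sketch only, not Radoop's internal code; the host name, NameNode port, and user are assumptions based on the sandbox defaults): with the flag set, the HDFS client connects to DataNodes by the hostname reported by the NameNode rather than by the reported IP address, which on the sandbox is a Docker-internal address, so the client machine can resolve that hostname to the VM's external IP, for example through a hosts-file entry.

    import java.net.URI;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FSDataOutputStream;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    // Client-side equivalent of the Advanced Hadoop Parameter suggested above.
    // With dfs.client.use.datanode.hostname=true the client addresses DataNodes by
    // hostname rather than by the IP address returned by the NameNode.
    public class DatanodeHostnameCheck {
        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();
            conf.set("dfs.client.use.datanode.hostname", "true"); // the suggested parameter

            FileSystem fs = FileSystem.get(
                    URI.create("hdfs://sandbox.hortonworks.com:8020"), conf, "hdfs");
            Path target = new Path("/tmp/radoop/hostname_check.txt");
            try (FSDataOutputStream out = fs.create(target, true)) {
                out.writeUTF("datanode hostname check"); // block write now goes to the DataNode hostname
            }
            System.out.println("Upload through the DataNode hostname succeeded.");
            fs.delete(target, false);
        }
    }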

  • yyhuang, Administrator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, Member, Posts: 363, RM Data Scientist

    I also have the same issue.

    I tried importing the Hadoop configuration files and also importing from the cluster manager, and added the extra Advanced Hadoop Parameters as @zprekopcsak instructed.

    But I still get the following error:

    [Jan 19, 2017 10:25:46 AM]: Connection test for 'Sandbox (192.168.8.128)' started.
    [Jan 19, 2017 10:25:46 AM]: Using Radoop version 7.4.0-ALPHA.
    [Jan 19, 2017 10:25:46 AM]: Running tests: [Hive connection, Fetch dynamic settings, Java version, HDFS, MapReduce, Radoop temporary directory, MapReduce staging directory, Spark staging directory, Spark assembly jar existence, UDF jar upload, Create permanent UDFs]
    [Jan 19, 2017 10:25:46 AM]: Running test 1/11: Hive connection
    [Jan 19, 2017 10:25:46 AM]: Hive server 2 connection (sandbox.hortonworks.com:10000) test started.
    [Jan 19, 2017 10:25:46 AM]: Test succeeded: Hive connection (0.141s)
    [Jan 19, 2017 10:25:46 AM]: Running test 2/11: Fetch dynamic settings
    [Jan 19, 2017 10:25:46 AM]: Retrieving required configuration properties...
    [Jan 19, 2017 10:25:46 AM]: Successfully fetched property: hive.execution.engine
    [Jan 19, 2017 10:25:46 AM]: Successfully fetched property: dfs.user.home.dir.prefix
    [Jan 19, 2017 10:25:46 AM]: Successfully fetched property: system:hdp.version
    [Jan 19, 2017 10:25:46 AM]: The specified local value of mapreduce.reduce.speculative (false) differs from remote value (true).
    [Jan 19, 2017 10:25:46 AM]: The specified local value of dfs.client.use.datanode.hostname (true) differs from remote value (false).
    [Jan 19, 2017 10:25:46 AM]: The specified local value of yarn.nodemanager.admin-env (MALLOC_ARENA_MAX=$MALLOC_ARENA_MAX) differs from remote value (MALLOC_ARENA_MAX).
    [Jan 19, 2017 10:25:46 AM]: The specified local value of yarn.app.mapreduce.am.command-opts (-Xmx409m -Dhdp.version=${hdp.version}) differs from remote value (-Xmx200m).
    [Jan 19, 2017 10:25:46 AM]: The specified local value of mapreduce.admin.map.child.java.opts (-server -XX:NewRatio=8 -Djava.net.preferIPv4Stack=true -Dhdp.version=${hdp.version}) differs from remote value (-server -XX:NewRatio).
    [Jan 19, 2017 10:25:46 AM]: The specified local value of mapreduce.admin.reduce.child.java.opts (-server -XX:NewRatio=8 -Djava.net.preferIPv4Stack=true -Dhdp.version=${hdp.version}) differs from remote value (-server -XX:NewRatio).
    [Jan 19, 2017 10:25:46 AM]: The specified local value of yarn.nodemanager.recovery.dir ({{yarn_log_dir_prefix}}/nodemanager/recovery-state) differs from remote value (/var/log/hadoop-yarn/nodemanager/recovery-state).
    [Jan 19, 2017 10:25:46 AM]: The specified local value of yarn.app.mapreduce.am.admin-command-opts (-Dhdp.version=${hdp.version}) differs from remote value (-Dhdp.version).
    [Jan 19, 2017 10:25:46 AM]: The specified local value of mapreduce.admin.user.env (LD_LIBRARY_PATH=/usr/hdp/${hdp.version}/hadoop/lib/native:/usr/hdp/${hdp.version}/hadoop/lib/native/Linux-amd64-64) differs from remote value (LD_LIBRARY_PATH).
    [Jan 19, 2017 10:25:46 AM]: Test succeeded: Fetch dynamic settings (0.209s)
    [Jan 19, 2017 10:25:46 AM]: Running test 3/11: Java version
    [Jan 19, 2017 10:25:46 AM]: Cluster Java version: 1.8.0_111-b15
    [Jan 19, 2017 10:25:46 AM]: Test succeeded: Java version (0.000s)
    [Jan 19, 2017 10:25:46 AM]: Running test 4/11: HDFS
    [Jan 19, 2017 10:25:46 AM]: Test succeeded: HDFS (0.151s)
    [Jan 19, 2017 10:25:46 AM]: Running test 5/11: MapReduce
    [Jan 19, 2017 10:25:46 AM]: Test succeeded: MapReduce (0.088s)
    [Jan 19, 2017 10:25:46 AM]: Running test 6/11: Radoop temporary directory
    [Jan 19, 2017 10:25:46 AM]: Test succeeded: Radoop temporary directory (0.011s)
    [Jan 19, 2017 10:25:46 AM]: Running test 7/11: MapReduce staging directory
    [Jan 19, 2017 10:25:46 AM]: Test succeeded: MapReduce staging directory (0.087s)
    [Jan 19, 2017 10:25:46 AM]: Running test 8/11: Spark staging directory
    [Jan 19, 2017 10:25:46 AM]: Test succeeded: Spark staging directory (0.017s)
    [Jan 19, 2017 10:25:46 AM]: Running test 9/11: Spark assembly jar existence
    [Jan 19, 2017 10:25:46 AM]: Spark assembly jar existence in the local:// file system cannot be checked. Test skipped.
    [Jan 19, 2017 10:25:46 AM]: Test succeeded: Spark assembly jar existence (0.000s)
    [Jan 19, 2017 10:25:46 AM]: Running test 10/11: UDF jar upload
    [Jan 19, 2017 10:25:46 AM]: File uploaded: 97.01 KB written in 0 seconds (67.72 MB/sec)
    [Jan 19, 2017 10:25:48 AM] SEVERE: File /tmp/radoop/_shared/db_default/radoop_hive-v4_UPLOADING_1484839546975_xdi3i5w.jar could only be replicated to 0 nodes instead of minReplication (=1). There are 1 datanode(s) running and 1 node(s) are excluded in this operation.
    at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget4NewBlock(BlockManager.java:1641)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getNewBlockTargets(FSNamesystem.java:3198)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:3122)
    at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:843)
    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:500)
    at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:640)
    at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:982)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2313)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2309)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1724)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2307)

    [Jan 19, 2017 10:25:48 AM] SEVERE: Test failed: UDF jar upload
    [Jan 19, 2017 10:25:48 AM]: Cleaning after test: UDF jar upload
    [Jan 19, 2017 10:25:48 AM]: Cleaning after test: Spark assembly jar existence
    [Jan 19, 2017 10:25:48 AM]: Cleaning after test: Spark staging directory
    [Jan 19, 2017 10:25:48 AM]: Cleaning after test: MapReduce staging directory
    [Jan 19, 2017 10:25:48 AM]: Cleaning after test: Radoop temporary directory
    [Jan 19, 2017 10:25:48 AM]: Cleaning after test: MapReduce
    [Jan 19, 2017 10:25:48 AM]: Cleaning after test: HDFS
    [Jan 19, 2017 10:25:48 AM]: Cleaning after test: Java version
    [Jan 19, 2017 10:25:48 AM]: Cleaning after test: Fetch dynamic settings
    [Jan 19, 2017 10:25:48 AM]: Cleaning after test: Hive connection
    [Jan 19, 2017 10:25:48 AM]: Total time: 1.761s
    [Jan 19, 2017 10:25:48 AM] SEVERE: com.rapidminer.operator.UserError: Could not upload the necessary component to the directory on the HDFS: '/tmp/radoop/_shared/db_default/'
    [Jan 19, 2017 10:25:48 AM] SEVERE: Hive jar (with additional functions) upload failed. Please check that the NameNode and DataNodes run and are accessible on the address and port you specified.
    [Jan 19, 2017 10:25:48 AM] SEVERE: Test failed: UDF jar upload
    [Jan 19, 2017 10:25:48 AM] SEVERE: Connection test for 'Sandbox (192.168.8.128)' failed.