Radoop Connection Issue: UDF Jar Upload

dorina124c Member Posts: 2 Contributor I
edited November 2018 in Help

Hi,

I'm trying to add a Radoop connection to Hortonworks Sandbox on Azure, but I'm stuck at the 10/11 test: UDF jar upload. I am a newbie in Hadoop and would greatly appreciate any help or advice.

I'm getting the following error message:

[Dec 29, 2016 4:01:29 PM]: Running test 10/11: UDF jar upload
[Dec 29, 2016 4:01:29 PM]: File uploaded: 97.04 KB written in 0 seconds (12.01 MB/sec)
[Dec 29, 2016 4:01:51 PM] SEVERE: File /tmp/radoop/_shared/db_default/radoop_hive-v4_UPLOADING_1483023689486_pisyasi.jar could only be replicated to 0 nodes instead of minReplication (=1). There are 1 datanode(s) running and 1 node(s) are excluded in this operation.
at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget4NewBlock(BlockManager.java:1588)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getNewBlockTargets(FSNamesystem.java:3116)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:3040)
at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:789)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:492)
at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:969)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2151)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2147)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2145)

[Dec 29, 2016 4:01:52 PM] SEVERE: Test failed: UDF jar upload

Best Answer

  • phellinger Employee, Member Posts: 103 RM Engineering
    Solution Accepted

    Hi,

    We have updated the guide for connecting to the latest Hortonworks Sandbox virtual machine. Following its steps carefully should solve the issues above.

    Please follow the guide at http://docs.www.turtlecreekpls.com/radoop/installation/distribution-notes.html.

    For those interested in the technical details, here is an explanation. The Hortonworks Sandbox connection problems appeared when Hortonworks updated the Sandbox environment so that Hadoop now runs in Docker inside VirtualBox. After this networking change, a hostname must be used to access the DataNodes, because the hostname can resolve to either the external or the internal IP address, depending on where it is resolved. Moreover, not all ports are exposed properly, which is why permanent iptables rules need to be added as a workaround.

    Best,

    Peter
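
Editor's sketch of the hostname point above, not taken from the linked guide: one common way to make a DataNode hostname resolve to the external IP on the client machine is a hosts-file entry. The helper names `hosts_entry` and `resolve` are hypothetical, and 203.0.113.10 is a documentation-only placeholder address, not a real Sandbox IP.

```python
import ipaddress
import socket


def hosts_entry(ip: str, hostname: str = "sandbox.hortonworks.com") -> str:
    """Format a hosts-file line that maps `ip` to `hostname`."""
    ipaddress.ip_address(ip)  # raises ValueError if `ip` is not a valid address
    return f"{ip}\t{hostname}"


def resolve(hostname: str):
    """Return the IPv4 address `hostname` resolves to on this machine, or None."""
    try:
        return socket.gethostbyname(hostname)
    except OSError:  # socket.gaierror is a subclass of OSError
        return None


# Placeholder address only; substitute the externally reachable IP of the VM.
print(hosts_entry("203.0.113.10"))
```

Calling `resolve("sandbox.hortonworks.com")` on the client before and after editing the hosts file is a quick way to see which address the client would actually use.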

Answers

  • phellinger Employee, Member Posts: 103 RM Engineering

    Hi,

    There is another (solved) topic about connecting to the Sandbox in Azure. See the solution below.

    Please add this advanced Hadoop property to the connection with a true value: dfs.client.use.datanode.hostname. (In this case, the DataNode is expected to be accessed via sandbox.hortonworks.com.)

    dfs_client_use_datanode_hostname.png

    Best,

    Peter
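
Editor's sketch: in Radoop the property above is entered as an advanced Hadoop parameter on the connection. For readers more used to Hadoop's XML configuration files, the same key/value pair can be rendered in the `<property>` form that files such as hdfs-site.xml use. The helper name `hadoop_property` is hypothetical, and this only builds the fragment; it does not modify any configuration.

```python
import xml.etree.ElementTree as ET


def hadoop_property(name: str, value: str) -> str:
    """Render one Hadoop configuration entry as a <property> XML fragment."""
    prop = ET.Element("property")
    ET.SubElement(prop, "name").text = name
    ET.SubElement(prop, "value").text = value
    return ET.tostring(prop, encoding="unicode")


# The client-side property suggested in this thread, set to true.
print(hadoop_property("dfs.client.use.datanode.hostname", "true"))
```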

  • dorina124c Member Posts: 2 Contributor I

    Hi Peter,

    Thanks a lot for your reply. I had already tried the solution you suggested, but I'm still stuck at the same test. I'm getting the following error:

    [Jan 4, 2017 12:16:39 PM] SEVERE: DataStreamer Exception:
    [Jan 4, 2017 12:16:39 PM] SEVERE: Test failed: UDF jar upload

  • phellinger Employee, Member Posts: 103 RM Engineering

    Hi,

    If the error message is different, the property may have had an effect, but there are further errors.

    The upload to HDFS was unsuccessful: the file could not be replicated to any node. The NameNode web interface (typically accessible at :50070 via a browser) usually shows whether the DataNodes are unhealthy (Live Nodes vs. Dead Nodes). This happens, for example, when the disks are full or when some other problem prevents the NameNode or the DataNodes from functioning.

    Best,

    Peter
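
Editor's sketch of the Live Nodes vs. Dead Nodes check described above, done from a script instead of a browser. It assumes the NameNode's JMX servlet is reachable on the same web port and exposes the `FSNamesystemState` bean with `NumLiveDataNodes` and `NumDeadDataNodes`. The function names are hypothetical, and the base URL passed to `fetch_counts` must be supplied by the reader.

```python
import json
from urllib.request import urlopen

# JMX query for the NameNode bean that carries the DataNode counters.
JMX_QUERY = "/jmx?qry=Hadoop:service=NameNode,name=FSNamesystemState"


def datanode_counts(jmx_payload: dict) -> tuple:
    """Extract (live, dead) DataNode counts from a parsed JMX response."""
    bean = jmx_payload["beans"][0]
    return bean["NumLiveDataNodes"], bean["NumDeadDataNodes"]


def fetch_counts(base_url: str, timeout: float = 10.0) -> tuple:
    """Query the NameNode web port at `base_url` and return (live, dead)."""
    with urlopen(base_url + JMX_QUERY, timeout=timeout) as resp:
        return datanode_counts(json.load(resp))
```

With `base_url` set to the NameNode address and web port (50070 in this thread), a result with zero live nodes would match the "could only be replicated to 0 nodes" error in the original post.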
