Spark Radoop connection
Hi everyone,
I am using Cloudera and have upgraded to Spark 2.2. I am having trouble when performing a Full Test. In the configuration, what should be in "Spark Archive (or libs) path"?
I tried getting the jar from (http://spark.apache.org/downloads.html), but I was not able to find the ". . assembly.jar" file. So I tried (local:///opt/cloudera/parcels/CDR/lib/spark/lib/spark-2.2.0-bin-hadoop2.6/jars/*), but it didn't work. I also tried the jars from (https://www.cloudera.com/documentation/spark2/latest/topics/spark2_packaging.html#packaging), with no luck.
[Dec 18, 2017 10:01:31 PM]: --------------------------------------------------
[Dec 18, 2017 10:01:31 PM]: Integration test for 'cluster (master)' started.
[Dec 18, 2017 10:01:31 PM]: Using Radoop version 8.0.0.
[Dec 18, 2017 10:01:31 PM]: Running tests: [Hive connection, Fetch dynamic settings, Java version, HDFS, MapReduce, Radoop temporary directory, MapReduce staging directory, Spark staging directory, Spark assembly jar existence, UDF jar upload, Create permanent UDFs, HDFS upload, Spark job]
[Dec 18, 2017 10:01:31 PM]: Running test 1/13: Hive connection
[Dec 18, 2017 10:01:31 PM]: Hive server 2 connection (master.c.strange-mason-188717.internal:10000) test started.
[Dec 18, 2017 10:01:31 PM]: Test succeeded: Hive connection (0.042s)
[Dec 18, 2017 10:01:31 PM]: Running test 2/13: Fetch dynamic settings
[Dec 18, 2017 10:01:31 PM]: Retrieving required configuration properties...
[Dec 18, 2017 10:01:31 PM]: Successfully fetched property: hive.execution.engine
[Dec 18, 2017 10:01:31 PM]: Successfully fetched property: mapreduce.jobhistory.done-dir
[Dec 18, 2017 10:01:31 PM]: Successfully fetched property: mapreduce.jobhistory.intermediate-done-dir
[Dec 18, 2017 10:01:31 PM]: Successfully fetched property: dfs.user.home.dir.prefix
[Dec 18, 2017 10:01:31 PM]: Could not fetch property dfs.encryption.key.provider.uri
[Dec 18, 2017 10:01:31 PM]: Successfully fetched property: spark.executor.memory
[Dec 18, 2017 10:01:31 PM]: Successfully fetched property: spark.executor.cores
[Dec 18, 2017 10:01:31 PM]: Successfully fetched property: spark.driver.memory
[Dec 18, 2017 10:01:31 PM]: Could not fetch property spark.driver.cores
[Dec 18, 2017 10:01:31 PM]: Successfully fetched property: spark.yarn.executor.memoryOverhead
[Dec 18, 2017 10:01:31 PM]: Successfully fetched property: spark.yarn.driver.memoryOverhead
[Dec 18, 2017 10:01:31 PM]: Successfully fetched property: spark.dynamicAllocation.enabled
[Dec 18, 2017 10:01:31 PM]: Successfully fetched property: spark.dynamicAllocation.initialExecutors
[Dec 18, 2017 10:01:31 PM]: Successfully fetched property: spark.dynamicAllocation.minExecutors
[Dec 18, 2017 10:01:31 PM]: Successfully fetched property: spark.dynamicAllocation.maxExecutors
[Dec 18, 2017 10:01:31 PM]: Could not fetch property spark.executor.instances
[Dec 18, 2017 10:01:31 PM]: The specified local value of mapreduce.job.reduces (1) differs from remote value (-1).
[Dec 18, 2017 10:01:31 PM]: The specified local value of mapreduce.reduce.speculative (false) differs from remote value (true).
[Dec 18, 2017 10:01:31 PM]: The specified local value of mapreduce.job.redacted-properties (fs.s3a.access.key,fs.s3a.secret.key) differs from remote value (fs.s3a.access.key,fs.s3a.secret.key,yarn.app.mapreduce.am.admin.user.env,mapreduce.admin.user.env,hadoop.security.credential.provider.path).
[Dec 18, 2017 10:01:31 PM]: Test succeeded: Fetch dynamic settings (0.024s)
[Dec 18, 2017 10:01:31 PM]: Running test 3/13: Java version
[Dec 18, 2017 10:01:31 PM]: Cluster Java version: 1.8.0_151-b12
[Dec 18, 2017 10:01:31 PM]: Test succeeded: Java version (0.000s)
[Dec 18, 2017 10:01:31 PM]: Running test 4/13: HDFS
[Dec 18, 2017 10:01:31 PM]: Test succeeded: HDFS (0.125s)
[Dec 18, 2017 10:01:31 PM]: Running test 5/13: MapReduce
[Dec 18, 2017 10:01:31 PM]: Test succeeded: MapReduce (0.022s)
[Dec 18, 2017 10:01:31 PM]: Running test 6/13: Radoop temporary directory
[Dec 18, 2017 10:01:31 PM]: Test succeeded: Radoop temporary directory (0.007s)
[Dec 18, 2017 10:01:31 PM]: Running test 7/13: MapReduce staging directory
[Dec 18, 2017 10:01:31 PM]: Test succeeded: MapReduce staging directory (0.040s)
[Dec 18, 2017 10:01:31 PM]: Running test 8/13: Spark staging directory
[Dec 18, 2017 10:01:31 PM]: Test succeeded: Spark staging directory (0.020s)
[Dec 18, 2017 10:01:31 PM]: Running test 9/13: Spark assembly jar existence
[Dec 18, 2017 10:01:31 PM]: Spark assembly jar existence in the local:// file system cannot be checked. Test skipped.
[Dec 18, 2017 10:01:31 PM]: Test succeeded: Spark assembly jar existence (0.000s)
[Dec 18, 2017 10:01:31 PM]: Running test 10/13: UDF jar upload
[Dec 18, 2017 10:01:32 PM]: Remote radoop_hive-v4.jar is up to date.
[Dec 18, 2017 10:01:32 PM]: Test succeeded: UDF jar upload (0.007s)
[Dec 18, 2017 10:01:32 PM]: Running test 11/13: Create permanent UDFs
[Dec 18, 2017 10:01:32 PM]: Remote radoop_hive-v4.jar is up to date.
[Dec 18, 2017 10:01:32 PM]: Test succeeded: Create permanent UDFs (0.025s)
[Dec 18, 2017 10:01:32 PM]: Running test 12/13: HDFS upload
[Dec 18, 2017 10:01:32 PM]: Uploaded test data file size: 5642
[Dec 18, 2017 10:01:32 PM]: Test succeeded: HDFS upload (0.047s)
[Dec 18, 2017 10:01:32 PM]: Running test 13/13: Spark job
[Dec 18, 2017 10:01:32 PM]: Assuming Spark version Spark 2.2.
[Dec 18, 2017 10:01:32 PM] SEVERE: Test failed: Spark job
[Dec 18, 2017 10:01:32 PM]: Cleaning after test: Spark job
[Dec 18, 2017 10:01:32 PM]: Cleaning after test: HDFS upload
[Dec 18, 2017 10:01:32 PM]: Cleaning after test: Create permanent UDFs
[Dec 18, 2017 10:01:32 PM]: Cleaning after test: UDF jar upload
[Dec 18, 2017 10:01:32 PM]: Cleaning after test: Spark assembly jar existence
[Dec 18, 2017 10:01:32 PM]: Cleaning after test: Spark staging directory
[Dec 18, 2017 10:01:32 PM]: Cleaning after test: MapReduce staging directory
[Dec 18, 2017 10:01:32 PM]: Cleaning after test: Radoop temporary directory
[Dec 18, 2017 10:01:32 PM]: Cleaning after test: MapReduce
[Dec 18, 2017 10:01:32 PM]: Cleaning after test: HDFS
[Dec 18, 2017 10:01:32 PM]: Cleaning after test: Java version
[Dec 18, 2017 10:01:32 PM]: Cleaning after test: Fetch dynamic settings
[Dec 18, 2017 10:01:32 PM]: Cleaning after test: Hive connection
[Dec 18, 2017 10:01:32 PM]: Total time: 0.732s
[Dec 18, 2017 10:01:32 PM]: java.lang.IllegalArgumentException: Required AM memory (1024+384 MB) is above the max threshold (1024 MB) of this cluster! Please increase the value of 'yarn.scheduler.maximum-allocation-mb'.
at org.apache.spark.deploy.yarn.Client.verifyClusterResources(Client.scala:311)
at org.apache.spark.deploy.yarn.Client.submitApplication(Client.scala:164)
at eu.radoop.datahandler.mapreducehdfs.YarnHandlerLowLevel.runSpark_invoke(YarnHandlerLowLevel.java:813)
at eu.radoop.datahandler.mapreducehdfs.YarnHandlerLowLevel.runSpark_invoke(YarnHandlerLowLevel.java:510)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at eu.radoop.datahandler.mapreducehdfs.MRHDFSHandlerLowLevel$2.run(MRHDFSHandlerLowLevel.java:650)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1807)
at eu.radoop.datahandler.mapreducehdfs.MRHDFSHandlerLowLevel.invokeAs(MRHDFSHandlerLowLevel.java:646)
at sun.reflect.GeneratedMethodAccessor123.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at eu.radoop.datahandler.mapreducehdfs.MapReduceHDFSHandler.invokeAs(MapReduceHDFSHandler.java:1801)
at eu.radoop.datahandler.mapreducehdfs.MapReduceHDFSHandler.invokeAs(MapReduceHDFSHandler.java:1759)
at eu.radoop.datahandler.mapreducehdfs.MapReduceHDFSHandler.lambda$runSpark$26(MapReduceHDFSHandler.java:1021)
at eu.radoop.tools.ExceptionTools.checkOnly(ExceptionTools.java:474)
at eu.radoop.datahandler.mapreducehdfs.MapReduceHDFSHandler.runSpark(MapReduceHDFSHandler.java:1016)
at eu.radoop.datahandler.mapreducehdfs.MapReduceHDFSHandler.runSpark(MapReduceHDFSHandler.java:913)
at eu.radoop.connections.service.test.integration.TestSpark.runTestSparkJob(TestSpark.java:331)
at eu.radoop.connections.service.test.integration.TestSpark.runJobWithVersion(TestSpark.java:218)
at eu.radoop.connections.service.test.integration.TestSpark.call(TestSpark.java:109)
at eu.radoop.connections.service.test.integration.TestSpark.call(TestSpark.java:52)
at eu.radoop.connections.service.test.RadoopTestContext.lambda$runTest$0(RadoopTestContext.java:255)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
[Dec 18, 2017 10:01:32 PM] SEVERE: java.lang.IllegalArgumentException: Required AM memory (1024+384 MB) is above the max threshold (1024 MB) of this cluster! Please increase the value of 'yarn.scheduler.maximum-allocation-mb'.
[Dec 18, 2017 10:01:32 PM] SEVERE: The Spark test failed. Please verify your Hadoop and Spark version and check if your assembly jar location is correct. If the job failed, check the logs on the ResourceManager web interface at http://master.c.strange-mason-188717.internal:8088.
[Dec 18, 2017 10:01:32 PM] SEVERE: Test failed: Spark job
[Dec 18, 2017 10:01:32 PM] SEVERE: Integration test for 'cluster (master)' failed.
Best Answer
amori (Member, Posts: 5, Contributor II)
Hi all,
I got it working! I had one slave node with 2 vCPUs and 7.5 GB of memory. I went to Cloudera Manager -> YARN -> Configuration and set
Container Memory (yarn.nodemanager.resource.memory-mb) = 7 GiB and Container Virtual CPU Cores (yarn.nodemanager.resource.cpu-vcores) = 2.
Also, I had to copy the jar files to the slave node, which I had missed doing before.
The Result:
[Dec 20, 2017 9:39:31 PM]: Integration test for 'cluster3' completed successfully.
Thank you, Peter, for helping out. Your replies on the other posts were a tremendous guide.
Answers
Hi,
You may actually have set the Spark Archive path properly (the local:// path seems correct), but the Spark client gives the following error:
"Required AM memory (1024+384 MB) is above the max threshold (1024 MB) of this cluster! Please increase the value of 'yarn.scheduler.maximum-allocation-mb'."
The maximum memory threshold on the cluster is not enough to run the Spark test job. 1 GB seems to be too low (the overhead alone is 384 MB). If this is a virtual machine, the RAM settings may have been low during installation, and yarn.scheduler.maximum-allocation-mb may have been calculated using that low value.
Best,
Peter
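The arithmetic behind that error message can be sketched in a few lines of Python. This is only an illustration, not Spark's actual code: the function names are made up, and the overhead rule (the larger of 384 MB and 10% of the requested memory) is an assumption about the defaults that happens to match the "1024+384" in the message.

```python
# Illustrative sketch of the memory check behind the error above.
# Not Spark's real implementation; names and defaults are assumptions.
MIN_OVERHEAD_MB = 384      # matches the "+384" in the error message
OVERHEAD_FRACTION = 0.10   # assumed default overhead fraction


def required_am_memory_mb(am_memory_mb: int) -> int:
    """Return the container size requested from YARN: memory plus overhead."""
    overhead = max(MIN_OVERHEAD_MB, int(am_memory_mb * OVERHEAD_FRACTION))
    return am_memory_mb + overhead


def fits(am_memory_mb: int, max_allocation_mb: int) -> bool:
    """True if the request stays within yarn.scheduler.maximum-allocation-mb."""
    return required_am_memory_mb(am_memory_mb) <= max_allocation_mb


if __name__ == "__main__":
    print(required_am_memory_mb(1024))  # 1408
    print(fits(1024, 1024))             # False: the situation in the log
    print(fits(1024, 2048))             # True
```

With a 1024 MB threshold the 1408 MB request cannot fit, which is why the test fails before any job is submitted.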
Thanks Peter!
I am using Cloudera, and the master node has 8 GB of memory. I increased "yarn.scheduler.maximum-allocation-mb" to 8 GB (from YARN -> Configuration) and restarted YARN, but that didn't work. Then I increased "yarn.nodemanager.resource.memory-mb" to 2 GB, and that worked. But now I'm getting another error (shown below). I tried decreasing the Resource Allocation % in the Radoop connection from the default 70% to 50%, but that didn't help.
Thanks,
[Dec 19, 2017 5:43:43 PM]: --------------------------------------------------
[Dec 19, 2017 5:43:43 PM]: Integration test for 'cluster2' started.
[Dec 19, 2017 5:43:43 PM]: Using Radoop version 8.0.0.
[Dec 19, 2017 5:43:43 PM]: Running tests: [Hive connection, Fetch dynamic settings, Java version, HDFS, MapReduce, Radoop temporary directory, MapReduce staging directory, Spark staging directory, Spark assembly jar existence, UDF jar upload, Create permanent UDFs, HDFS upload, Spark job]
[Dec 19, 2017 5:43:43 PM]: Running test 1/13: Hive connection
[Dec 19, 2017 5:43:43 PM]: Hive server 2 connection (master.c.strange-mason-188717.internal:10000) test started.
[Dec 19, 2017 5:43:43 PM]: Test succeeded: Hive connection (0.105s)
[Dec 19, 2017 5:43:43 PM]: Running test 2/13: Fetch dynamic settings
[Dec 19, 2017 5:43:43 PM]: Retrieving required configuration properties...
[Dec 19, 2017 5:43:43 PM]: Successfully fetched property: hive.execution.engine
[Dec 19, 2017 5:43:43 PM]: Successfully fetched property: mapreduce.jobhistory.done-dir
[Dec 19, 2017 5:43:43 PM]: Successfully fetched property: mapreduce.jobhistory.intermediate-done-dir
[Dec 19, 2017 5:43:43 PM]: Successfully fetched property: dfs.user.home.dir.prefix
[Dec 19, 2017 5:43:43 PM]: Could not fetch property dfs.encryption.key.provider.uri
[Dec 19, 2017 5:43:43 PM]: Successfully fetched property: spark.executor.memory
[Dec 19, 2017 5:43:43 PM]: Successfully fetched property: spark.executor.cores
[Dec 19, 2017 5:43:43 PM]: Successfully fetched property: spark.driver.memory
[Dec 19, 2017 5:43:43 PM]: Could not fetch property spark.driver.cores
[Dec 19, 2017 5:43:43 PM]: Successfully fetched property: spark.yarn.executor.memoryOverhead
[Dec 19, 2017 5:43:43 PM]: Successfully fetched property: spark.yarn.driver.memoryOverhead
[Dec 19, 2017 5:43:43 PM]: Successfully fetched property: spark.dynamicAllocation.enabled
[Dec 19, 2017 5:43:43 PM]: Successfully fetched property: spark.dynamicAllocation.initialExecutors
[Dec 19, 2017 5:43:43 PM]: Successfully fetched property: spark.dynamicAllocation.minExecutors
[Dec 19, 2017 5:43:43 PM]: Successfully fetched property: spark.dynamicAllocation.maxExecutors
[Dec 19, 2017 5:43:43 PM]: Could not fetch property spark.executor.instances
[Dec 19, 2017 5:43:43 PM]: The specified local value of mapreduce.job.reduces (1) differs from remote value (-1).
[Dec 19, 2017 5:43:43 PM]: The specified local value of mapreduce.reduce.speculative (false) differs from remote value (true).
[Dec 19, 2017 5:43:43 PM]: The specified local value of mapreduce.job.redacted-properties (fs.s3a.access.key,fs.s3a.secret.key) differs from remote value (fs.s3a.access.key,fs.s3a.secret.key,yarn.app.mapreduce.am.admin.user.env,mapreduce.admin.user.env,hadoop.security.credential.provider.path).
[Dec 19, 2017 5:43:43 PM]: The specified local value of yarn.scheduler.maximum-allocation-mb (8192) differs from remote value (2048).
[Dec 19, 2017 5:43:43 PM]: Test succeeded: Fetch dynamic settings (0.022s)
[Dec 19, 2017 5:43:43 PM]: Running test 3/13: Java version
[Dec 19, 2017 5:43:43 PM]: Cluster Java version: 1.8.0_151-b12
[Dec 19, 2017 5:43:43 PM]: Test succeeded: Java version (0.000s)
[Dec 19, 2017 5:43:43 PM]: Running test 4/13: HDFS
[Dec 19, 2017 5:43:43 PM]: Test succeeded: HDFS (0.117s)
[Dec 19, 2017 5:43:43 PM]: Running test 5/13: MapReduce
[Dec 19, 2017 5:43:43 PM]: Test succeeded: MapReduce (0.043s)
[Dec 19, 2017 5:43:43 PM]: Running test 6/13: Radoop temporary directory
[Dec 19, 2017 5:43:43 PM]: Test succeeded: Radoop temporary directory (0.022s)
[Dec 19, 2017 5:43:43 PM]: Running test 7/13: MapReduce staging directory
[Dec 19, 2017 5:43:43 PM]: Test succeeded: MapReduce staging directory (0.023s)
[Dec 19, 2017 5:43:43 PM]: Running test 8/13: Spark staging directory
[Dec 19, 2017 5:43:43 PM]: Test succeeded: Spark staging directory (0.028s)
[Dec 19, 2017 5:43:43 PM]: Running test 9/13: Spark assembly jar existence
[Dec 19, 2017 5:43:43 PM]: Spark assembly jar existence in the local:// file system cannot be checked. Test skipped.
[Dec 19, 2017 5:43:43 PM]: Test succeeded: Spark assembly jar existence (0.000s)
[Dec 19, 2017 5:43:43 PM]: Running test 10/13: UDF jar upload
[Dec 19, 2017 5:43:43 PM]: Remote radoop_hive-v4.jar is up to date.
[Dec 19, 2017 5:43:43 PM]: Test succeeded: UDF jar upload (0.085s)
[Dec 19, 2017 5:43:43 PM]: Running test 11/13: Create permanent UDFs
[Dec 19, 2017 5:43:43 PM]: Remote radoop_hive-v4.jar is up to date.
[Dec 19, 2017 5:43:44 PM]: Test succeeded: Create permanent UDFs (0.060s)
[Dec 19, 2017 5:43:44 PM]: Running test 12/13: HDFS upload
[Dec 19, 2017 5:43:44 PM]: Uploaded test data file size: 5642
[Dec 19, 2017 5:43:44 PM]: Test succeeded: HDFS upload (0.077s)
[Dec 19, 2017 5:43:44 PM]: Running test 13/13: Spark job
[Dec 19, 2017 5:43:44 PM]: Assuming Spark version Spark 2.2.
[Dec 19, 2017 5:47:44 PM] SEVERE: Test failed: Spark job
[Dec 19, 2017 5:47:44 PM]: Cleaning after test: Spark job
[Dec 19, 2017 5:47:44 PM]: Cleaning after test: HDFS upload
[Dec 19, 2017 5:47:44 PM]: Cleaning after test: Create permanent UDFs
[Dec 19, 2017 5:47:44 PM]: Cleaning after test: UDF jar upload
[Dec 19, 2017 5:47:44 PM]: Cleaning after test: Spark assembly jar existence
[Dec 19, 2017 5:47:44 PM]: Cleaning after test: Spark staging directory
[Dec 19, 2017 5:47:44 PM]: Cleaning after test: MapReduce staging directory
[Dec 19, 2017 5:47:44 PM]: Cleaning after test: Radoop temporary directory
[Dec 19, 2017 5:47:44 PM]: Cleaning after test: MapReduce
[Dec 19, 2017 5:47:44 PM]: Cleaning after test: HDFS
[Dec 19, 2017 5:47:44 PM]: Cleaning after test: Java version
[Dec 19, 2017 5:47:44 PM]: Cleaning after test: Fetch dynamic settings
[Dec 19, 2017 5:47:44 PM]: Cleaning after test: Hive connection
[Dec 19, 2017 5:47:44 PM]: Total time: 240.708s
[Dec 19, 2017 5:47:44 PM] SEVERE: java.util.concurrent.TimeoutException
[Dec 19, 2017 5:47:44 PM] SEVERE: Timeout on the Spark test job. Please verify your Spark Resource Allocation settings on the Advanced Connection Properties window. You can check the logs of the Spark job on the ResourceManager web interface at http://master.c.strange-mason-188717.internal:8088.
[Dec 19, 2017 5:47:44 PM] SEVERE: Test failed: Spark job
[Dec 19, 2017 5:47:44 PM] SEVERE: Integration test for 'cluster2' failed.
Hi,
Good move on the memory settings front. However, the second setting means that effectively only 2 GB is allocated per NodeManager (node), which is quite small in the YARN world. Resource calculations (like the heuristic percentage in the connection) are realistic only when 8 GB of memory is available to jobs (NodeManager) per node.
This timeout may be caused by the job not getting the resources it needs, so it never started. It may be that only the Spark driver got resources, but no executor (worker) did. There are two ways to confirm this: 1) open the Resource Manager web UI, where the running job and its resource allocation are visible; 2) add the Log panel via View -> Show Panel in the Studio Design view, then right-click it and set the log level to FINE. After the test, the log should show whether the Spark job failed to get resources.
Hope this helps.
Peter
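To see why a 2 GB NodeManager can leave the executors without resources, here is a rough Python sketch. It rests on several assumptions that are not stated in the thread: the driver/AM and the executors share a single node, YARN rounds each request up to a multiple of an increment (512 MB here), and an executor asks for 1024+384 MB. The function names are hypothetical, and Radoop's Resource Allocation % is not modeled.

```python
import math


def normalize(request_mb: int, increment_mb: int) -> int:
    """Round a container request up to the next multiple of the increment."""
    return math.ceil(request_mb / increment_mb) * increment_mb


def executors_that_fit(node_mb: int, am_mb: int,
                       executor_mb: int, increment_mb: int) -> int:
    """Count executor containers that fit on one node next to the AM/driver."""
    left = node_mb - normalize(am_mb, increment_mb)
    if left < 0:
        return 0  # not even the AM/driver container fits
    return left // normalize(executor_mb, increment_mb)


if __name__ == "__main__":
    # 2 GB NodeManager: the driver fits, but no executor does.
    print(executors_that_fit(2048, 1408, 1408, 512))  # 0
    # 7 GiB NodeManager: room for the driver plus several executors.
    print(executors_that_fit(7168, 1408, 1408, 512))  # 3
```

Under these assumptions the job would be accepted but sit idle until the test times out, which is consistent with the symptom described above.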
Thanks again, Peter.
You are right, I have checked and this is what I found:
0 MB-seconds, 0 vcore-seconds
Any idea why this is happening? Currently, I only have one master node and one slave node. Could that be the problem? I can't see that the slave node has been used. Would adding more slave nodes solve this?
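The same figures can also be read outside the web UI. The sketch below is a hedged illustration: it assumes the JSON shape returned by the ResourceManager REST endpoint /ws/v1/cluster/apps (an "apps" object holding an "app" list with state, memorySeconds and vcoreSeconds fields), and it runs on a made-up sample payload rather than on real cluster output. The function name and application ids are hypothetical.

```python
import json


def starved_apps(payload: str) -> list:
    """Return ids of unfinished apps that have consumed no resources yet."""
    # The ResourceManager may return {"apps": null} when there are no apps.
    apps = (json.loads(payload).get("apps") or {}).get("app", [])
    return [
        app["id"]
        for app in apps
        if app.get("state") in ("ACCEPTED", "RUNNING")
        and app.get("memorySeconds", 0) == 0
        and app.get("vcoreSeconds", 0) == 0
    ]


# Hypothetical sample payload, not taken from the cluster in this thread.
SAMPLE = json.dumps({"apps": {"app": [
    {"id": "application_0001", "state": "ACCEPTED",
     "memorySeconds": 0, "vcoreSeconds": 0},
    {"id": "application_0002", "state": "RUNNING",
     "memorySeconds": 5120, "vcoreSeconds": 4},
]}})

if __name__ == "__main__":
    print(starved_apps(SAMPLE))  # ['application_0001']
```

An application that stays at 0 MB-seconds and 0 vcore-seconds was never given a container, so the scheduler limits are the first thing to check.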