WARN YarnScheduler: Initial job has not accepted any resources. Running a Spark job on YARN programmatically from Eclipse

Problem description

I am running a Spark job on YARN programmatically from the Eclipse IDE. I have read many answers about this specific problem, but none of them solved my case. I am developing on Ubuntu 20 (Linux) with 8 GB of RAM and 4 cores, using Eclipse 2021-03. The test is launched from my first user, davben, while Hadoop and Spark are installed under my second user, hadoop. For this test I set the environment variable SPARK_HOME to /home/hadoop/spark, the folder where Spark is installed.
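One detail worth checking in this kind of setup: when the YARN client runs inside the IDE's JVM, it locates the cluster through the Hadoop client configuration (HADOOP_CONF_DIR or YARN_CONF_DIR), not through SPARK_HOME alone; the log below shows it connecting to a ResourceManager at 0.0.0.0:8032, which is YARN's default when no explicit address is configured. A minimal sanity check, assuming the code runs from an Eclipse run configuration (the check itself is illustrative, not part of the original test):

// Sanity-check sketch: warn if the Hadoop client configuration is not
// visible to this JVM; without it the YARN client uses built-in defaults.
String hadoopConfDir = System.getenv("HADOOP_CONF_DIR");
if (hadoopConfDir == null) {
    System.err.println("HADOOP_CONF_DIR is not set; the YARN client will fall back "
            + "to defaults such as a ResourceManager at 0.0.0.0:8032.");
}

With that in mind, here is what I do.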

1: I open a Spark session, passing a SparkConf as input:

System.setProperty("hadoop.home.dir", "/home/hadoop/hadoop");
System.setProperty("SPARK_YARN_MODE", "yarn");
System.setProperty("HADOOP_USER_NAME", "hadoop");
SparkConf sparkConf = new SparkConf()
        .setAppName("simpleTest2")
        .setMaster("yarn")
        .set("spark.executor.memory", "1g")
        .set("deploy.mode", "cluster")
        .set("spark.yarn.stagingDir", "hdfs://localhost:9000/user/hadoop/")
        .set("spark.yarn.am.memory", "512m")
        .set("spark.dynamicAllocation.enabled", "false")
        .set("spark.cores.max", "1")
        .set("spark.yarn.executor.memoryOverhead", "500m")
        .set("spark.executor.instances", "2")
        .set("spark.executor.memory", "500m")
        .set("spark.num.executors", "2")
        .set("spark.executor.cores", "1")
        .set("spark.worker.instances", "1")
        .set("spark.worker.memory", "512m")
        .set("spark.worker.max.heapsize", "512m")
        .set("spark.worker.cores", "1")
        .set("spark.yarn.nodemanager.resource.cpu-vcores", "4")
        .set("spark.yarn.submit.file.replication", "1");
SparkSession spark = SparkSession.builder().config(sparkConf).getOrCreate();
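A few of these keys deserve a second look. The log further down flags spark.yarn.executor.memoryOverhead as deprecated since Spark 2.3 in favour of spark.executor.memoryOverhead; deploy.mode is not a recognized Spark property (the documented key is spark.submit.deployMode, and a SparkSession created in-process always runs as a client-mode driver anyway); spark.num.executors is likewise not a documented key (spark.executor.instances, which is already set, is the real one); the spark.worker.* keys apply to standalone-mode workers, not YARN, and spark.yarn.nodemanager.resource.cpu-vcores is a NodeManager server setting that a Spark client cannot set; finally, spark.executor.memory is set twice, first to 1g and then to 500m, so 500m wins. A trimmed sketch using only current, documented keys, with illustrative values rather than a confirmed fix:

// Sketch: the same intent with current, documented keys only. The
// unrecognized and standalone-only keys from the original are dropped,
// and the deprecated spark.yarn.executor.memoryOverhead is replaced.
SparkConf cleanedConf = new SparkConf()
        .setAppName("simpleTest2")
        .setMaster("yarn") // an in-process driver implies client deploy mode
        .set("spark.yarn.stagingDir", "hdfs://localhost:9000/user/hadoop/")
        .set("spark.yarn.am.memory", "512m")
        .set("spark.dynamicAllocation.enabled", "false")
        .set("spark.executor.instances", "2")
        .set("spark.executor.memory", "500m")
        .set("spark.executor.memoryOverhead", "500m")
        .set("spark.executor.cores", "1");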

2: I create a dataset of two rows and show it:

List<Row> rows = new ArrayList<>();
rows.add(RowFactory.create("a", "b"));
rows.add(RowFactory.create("a", "a"));
StructType structType = new StructType();
structType = structType.add("edge_1", DataTypes.StringType, false);
structType = structType.add("edge_2", DataTypes.StringType, false);
ExpressionEncoder<Row> edgeEncoder = RowEncoder.apply(structType);
Dataset<Row> edge = spark.createDataset(rows, edgeEncoder);
edge.show();
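As an aside, the explicit encoder is only needed for the typed map in step 3; the dataset itself could equally be built with createDataFrame. A sketch reusing the same rows and structType (edgeDf is an illustrative name):

// Equivalent construction without an explicit encoder.
Dataset<Row> edgeDf = spark.createDataFrame(rows, structType);
edgeDf.show();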

 

Up to this point everything works: the job is submitted to Hadoop and the rows are displayed correctly.

3: I apply a map that upper-cases the elements of each row:

Dataset<Row> edge2 = edge.map(new MyFunction2(), edgeEncoder);



public static class MyFunction2 implements MapFunction<Row, Row> {
    private static final long serialVersionUID = 1L;

    @Override
    public Row call(Row v1) throws Exception {
        String el1 = v1.get(0).toString().toUpperCase();
        String el2 = v1.get(1).toString().toUpperCase();
        return RowFactory.create(el1, el2);
    }
}
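For completeness, the same transformation can be written inline; the cast to MapFunction makes the Java compiler pick the Java overload of map rather than the Scala one. A sketch equivalent to MyFunction2, with edge2b as an illustrative name:

// Inline equivalent of MyFunction2; the cast disambiguates the overload.
Dataset<Row> edge2b = edge.map(
        (MapFunction<Row, Row>) row -> RowFactory.create(
                row.getString(0).toUpperCase(),
                row.getString(1).toUpperCase()),
        edgeEncoder);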

4: Then I show the dataset resulting from the map:

edge2.show();

It is exactly at this point that the log starts looping, printing:

> 21/09/28 11:18:51 WARN YarnScheduler: Initial job has not accepted any
> resources; check your cluster UI to ensure that workers are registered
> and have sufficient resources

Here is part of the log:

log4j:WARN No appenders could be found for logger (org.apache.hadoop.metrics2.lib.MutableMetricsFactory).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
21/09/28 03:05:16 WARN Utils: Your hostname, davben-lubuntu resolves to a loopback address: 127.0.1.1; using 192.168.1.36 instead (on interface wlo1)
21/09/28 03:05:16 WARN Utils: Set SPARK_LOCAL_IP if you need to bind to another address
WARNING: An illegal reflective access operation has occurred
WARNING: Illegal reflective access by org.apache.spark.unsafe.Platform (file:/home/davben/.m2/repository/org/apache/spark/spark-unsafe_2.12/3.1.2/spark-unsafe_2.12-3.1.2.jar) to constructor java.nio.DirectByteBuffer(long,int)
WARNING: Please consider reporting this to the maintainers of org.apache.spark.unsafe.Platform
WARNING: Use --illegal-access=warn to enable warnings of further illegal reflective access operations
WARNING: All illegal access operations will be denied in a future release
21/09/28 03:05:16 WARN SparkConf: The configuration key 'spark.yarn.executor.memoryOverhead' has been deprecated as of Spark 2.3 and may be removed in the future. Please use the new key 'spark.executor.memoryOverhead' instead.
21/09/28 03:05:16 INFO SparkContext: Running Spark version 3.1.2
21/09/28 03:05:16 INFO ResourceUtils: ==============================================================
21/09/28 03:05:16 INFO ResourceUtils: No custom resources configured for spark.driver.
21/09/28 03:05:16 INFO ResourceUtils: ==============================================================
21/09/28 03:05:16 INFO SparkContext: Submitted application: simpleTest2
21/09/28 03:05:16 INFO ResourceProfile: Default ResourceProfile created, executor resources: Map(memoryOverhead -> name: memoryOverhead, amount: 500, script: , vendor: , cores -> name: cores, amount: 1, script: , vendor: , memory -> name: memory, amount: 500, script: , vendor: , offHeap -> name: offHeap, amount: 0, script: , vendor: ), task resources: Map(cpus -> name: cpus, amount: 1.0)
21/09/28 03:05:16 INFO ResourceProfile: Limiting resource is cpus at 1 tasks per executor
21/09/28 03:05:16 INFO ResourceProfileManager: Added ResourceProfile id: 0
21/09/28 03:05:16 INFO SecurityManager: Changing view acls to: davben,hadoop
21/09/28 03:05:16 INFO SecurityManager: Changing modify acls to: davben,hadoop
21/09/28 03:05:16 INFO SecurityManager: Changing view acls groups to: 
21/09/28 03:05:16 INFO SecurityManager: Changing modify acls groups to: 
21/09/28 03:05:16 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users  with view permissions: Set(davben, hadoop); groups with view permissions: Set(); users  with modify permissions: Set(davben, hadoop); groups with modify permissions: Set()
21/09/28 03:05:16 INFO Utils: Successfully started service 'sparkDriver' on port 38955.
21/09/28 03:05:17 INFO SparkEnv: Registering MapOutputTracker
21/09/28 03:05:17 INFO SparkEnv: Registering BlockManagerMaster
21/09/28 03:05:17 INFO BlockManagerMasterEndpoint: Using org.apache.spark.storage.DefaultTopologyMapper for getting topology information
21/09/28 03:05:17 INFO BlockManagerMasterEndpoint: BlockManagerMasterEndpoint up
21/09/28 03:05:17 INFO SparkEnv: Registering BlockManagerMasterHeartbeat
21/09/28 03:05:17 INFO DiskBlockManager: Created local directory at /tmp/blockmgr-33cfa046-2c7b-45b5-9cf1-c0f20195a984
21/09/28 03:05:17 INFO MemoryStore: MemoryStore started with capacity 994.8 MiB
21/09/28 03:05:17 INFO SparkEnv: Registering OutputCommitCoordinator
21/09/28 03:05:17 INFO Utils: Successfully started service 'SparkUI' on port 4040.
21/09/28 03:05:17 INFO SparkUI: Bound SparkUI to 0.0.0.0, and started at http://davben-lubuntu.home:4040
21/09/28 03:05:17 INFO RMProxy: Connecting to ResourceManager at /0.0.0.0:8032
21/09/28 03:05:18 INFO Client: Requesting a new application from cluster with 1 NodeManagers
21/09/28 03:05:18 INFO Configuration: resource-types.xml not found
21/09/28 03:05:18 INFO ResourceUtils: Unable to find 'resource-types.xml'.
21/09/28 03:05:18 INFO Client: Verifying our application has not requested more than the maximum memory capability of the cluster (8192 MB per container)
21/09/28 03:05:18 INFO Client: Will allocate AM container, with 896 MB memory including 384 MB overhead
21/09/28 03:05:18 INFO Client: Setting up container launch context for our AM
21/09/28 03:05:18 INFO Client: Setting up the launch environment for our AM container
21/09/28 03:05:18 INFO Client: Preparing resources for our AM container
21/09/28 03:05:18 INFO Client: Source and destination file systems are the same.
INFO Client: Uploading resource file:/tmp/spark-3b097b0a-1f72-4cb1-9dd1-f419cdbd645f/__spark_conf__17045972699920737203.zip -> hdfs://localhost:9000/user/hadoop/hadoop/.sparkStaging/application_1632774234019_0016/__spark_conf__.zip
21/09/28 03:05:19 INFO SecurityManager: Changing view acls to: davben,hadoop
21/09/28 03:05:19 INFO SecurityManager: Changing modify acls to: davben,hadoop
21/09/28 03:05:19 INFO SecurityManager: Changing view acls groups to: 
21/09/28 03:05:19 INFO SecurityManager: Changing modify acls groups to: 
21/09/28 03:05:19 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users  with view permissions: Set(davben, hadoop); groups with view permissions: Set(); users  with modify permissions: Set(davben, hadoop); groups with modify permissions: Set()
21/09/28 03:05:19 INFO Client: Submitting application application_1632774234019_0016 to ResourceManager
21/09/28 03:05:19 INFO YarnClientImpl: Submitted application application_1632774234019_0016
21/09/28 03:05:20 INFO Client: Application report for application_1632774234019_0016 (state: ACCEPTED)
21/09/28 03:05:20 INFO Client: 
     client token: N/A
     diagnostics: AM container is launched, waiting for AM container to Register with RM
     ApplicationMaster host: N/A
     ApplicationMaster RPC port: -1
     queue: default
     start time: 1632791119751
     final status: UNDEFINED
     tracking URL: http://davben-lubuntu:8088/proxy/application_1632774234019_0016/
     user: hadoop
21/09/28 03:05:21 INFO Client: Application report for application_1632774234019_0016 (state: ACCEPTED)
21/09/28 03:05:22 INFO Client: Application report for application_1632774234019_0016 (state: ACCEPTED)
21/09/28 03:05:23 INFO Client: Application report for application_1632774234019_0016 (state: RUNNING)
21/09/28 03:05:23 INFO Client: 
     client token: N/A
     diagnostics: N/A
     ApplicationMaster host: 192.168.1.36
     ApplicationMaster RPC port: -1
     queue: default
     start time: 1632791119751
     final status: UNDEFINED
     tracking URL: http://davben-lubuntu:8088/proxy/application_1632774234019_0016/
     user: hadoop
21/09/28 03:05:23 INFO YarnClientSchedulerBackend: Application application_1632774234019_0016 has started running.
21/09/28 03:05:23 INFO Utils: Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 33377.
21/09/28 03:05:23 INFO NettyBlockTransferService: Server created on davben-lubuntu.home:33377
21/09/28 03:05:23 INFO BlockManager: Using org.apache.spark.storage.RandomBlockReplicationPolicy for block replication policy
21/09/28 03:05:23 INFO BlockManagerMaster: Registering BlockManager BlockManagerId(driver, davben-lubuntu.home, 33377, None)
21/09/28 03:05:23 INFO BlockManagerMasterEndpoint: Registering block manager davben-lubuntu.home:33377 with 994.8 MiB RAM, BlockManagerId(driver, davben-lubuntu.home, 33377, None)
21/09/28 03:05:23 INFO BlockManagerMaster: Registered BlockManager BlockManagerId(driver, davben-lubuntu.home, 33377, None)
21/09/28 03:05:23 INFO BlockManager: external shuffle service port = 7337
21/09/28 03:05:23 INFO BlockManager: Initialized BlockManager: BlockManagerId(driver, davben-lubuntu.home, 33377, None)
21/09/28 03:05:23 INFO YarnClientSchedulerBackend: Add WebUI Filter. org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter, Map(PROXY_HOSTS -> davben-lubuntu, PROXY_URI_BASES -> http://davben-lubuntu:8088/proxy/application_1632774234019_0016), /proxy/application_1632774234019_0016
21/09/28 03:05:24 INFO ServerInfo: Adding filter to /metrics/json: org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter
21/09/28 03:05:25 INFO YarnSchedulerBackend$YarnSchedulerEndpoint: ApplicationMaster registered as NettyRpcEndpointRef(spark-client://YarnAM)
21/09/28 03:05:47 INFO YarnClientSchedulerBackend: SchedulerBackend is ready for scheduling beginning after waiting maxRegisteredResourcesWaitingTime: 30000000000(ns)
21/09/28 03:05:49 INFO SharedState: Setting hive.metastore.warehouse.dir ('null') to the value of spark.sql.warehouse.dir ('file:/home/davben/prometheux/projects/spark-reasoner/spark-warehouse').
21/09/28 03:05:49 INFO SharedState: Warehouse path is 'file:/home/davben/prometheux/projects/spark-reasoner/spark-warehouse'.
21/09/28 03:05:49 INFO ServerInfo: Adding filter to /SQL: org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter
21/09/28 03:05:49 INFO ServerInfo: Adding filter to /SQL/json: org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter
21/09/28 03:05:49 INFO ServerInfo: Adding filter to /SQL/execution: org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter
21/09/28 03:05:49 INFO ServerInfo: Adding filter to /SQL/execution/json: org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter
21/09/28 03:05:49 INFO ServerInfo: Adding filter to /static/sql: org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter
21/09/28 03:05:50 INFO CodeGenerator: Code generated in 169.2098 ms
21/09/28 03:05:50 INFO CodeGenerator: Code generated in 13.789217 ms
21/09/28 03:05:51 INFO CodeGenerator: Code generated in 17.287752 ms

+------+------+
|edge_1|edge_2|
+------+------+
|     a|     b|
|     a|     a|
+------+------+

21/09/28 03:05:51 INFO CodeGenerator: Code generated in 45.072695 ms
21/09/28 03:05:51 INFO SparkContext: Starting job: show at TrivialTests.java:1246
21/09/28 03:05:51 INFO DAGScheduler: Got job 0 (show at TrivialTests.java:1246) with 1 output partitions
21/09/28 03:05:51 INFO DAGScheduler: Final stage: ResultStage 0 (show at TrivialTests.java:1246)
21/09/28 03:05:51 INFO DAGScheduler: Parents of final stage: List()
21/09/28 03:05:51 INFO DAGScheduler: Missing parents: List()
21/09/28 03:05:51 INFO DAGScheduler: Submitting ResultStage 0 (MapPartitionsRDD[2] at show at TrivialTests.java:1246), which has no missing parents
21/09/28 03:05:51 INFO MemoryStore: Block broadcast_0 stored as values in memory (estimated size 13.3 KiB, free 994.8 MiB)
21/09/28 03:05:51 INFO MemoryStore: Block broadcast_0_piece0 stored as bytes in memory (estimated size 5.3 KiB, free 994.8 MiB)
21/09/28 03:05:51 INFO BlockManagerInfo: Added broadcast_0_piece0 in memory on davben-lubuntu.home:33377 (size: 5.3 KiB, free: 994.8 MiB)
21/09/28 03:05:51 INFO SparkContext: Created broadcast 0 from broadcast at DAGScheduler.scala:1388
21/09/28 03:05:51 INFO DAGScheduler: Submitting 1 missing tasks from ResultStage 0 (MapPartitionsRDD[2] at show at TrivialTests.java:1246) (first 15 tasks are for partitions Vector(0))
21/09/28 03:05:51 INFO YarnScheduler: Adding task set 0.0 with 1 tasks resource profile 0
21/09/28 03:06:06 WARN YarnScheduler: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources
21/09/28 03:06:21 WARN YarnScheduler: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources
21/09/28 03:06:36 WARN YarnScheduler: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources
21/09/28 03:06:51 WARN YarnScheduler: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources

The strange thing is that the very same job works fine when run locally with setMaster("local[*]").

Tags: java, eclipse, apache-spark, hadoop, hdfs

Solution
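The reported behaviour narrows things down. The show() in step 2 succeeds because the dataset is a small local relation that Spark can evaluate entirely on the driver; the mapped dataset in step 4 is the first operation that actually requires an executor. The warning therefore means that YARN accepted the application (the log reaches state RUNNING and the ApplicationMaster registers) but never allocated any executor containers. This also explains why local[*] works: in local mode the executors live inside the driver JVM and YARN is not involved at all.

The first places to look are the ResourceManager UI at http://davben-lubuntu:8088 (whether the application holds any containers beyond the 896 MB ApplicationMaster, and whether the NodeManager reports free memory and vcores) and the container logs of the application, where failed executor launches surface. One commonly suggested adjustment for a single 8 GB node is to request one modestly sized executor so that the ApplicationMaster and the executor clearly fit together. The following is a sketch of that direction, not a confirmed fix for this exact setup:

// Hedged sketch: one executor with a request that comfortably fits on a
// single 8 GB NodeManager next to the ApplicationMaster. Values are
// illustrative; what actually blocks allocation should be read off the
// ResourceManager UI and the container logs.
sparkConf.set("spark.executor.instances", "1")
         .set("spark.executor.memory", "1g")
         .set("spark.executor.memoryOverhead", "384m")
         .set("spark.executor.cores", "1");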

