PySpark hangs on a simple command

Problem description

PySpark hangs on the following input. Note that the same operation does not hang in the Scala console.

Python 3.6.5 (default, Jun 17 2018, 12:13:06) 
[GCC 4.2.1 Compatible Apple LLVM 9.1.0 (clang-902.0.39.2)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
2018-06-21 10:27:37 WARN  NativeCodeLoader:62 - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/  '_/
   /__ / .__/\_,_/_/ /_/\_\   version 2.3.1
      /_/

Using Python version 3.6.5 (default, Jun 17 2018 12:13:06)
SparkSession available as 'spark'.
>>> sc.parallelize((1,1)).count()     <-----------HANGS!

Does anyone know why this happens? I tried reinstalling everything (Java, Spark, Homebrew) and even deleted the entire /usr/local directory. I am out of ideas.

A different test program

from pyspark import SparkContext
sc = SparkContext.getOrCreate()
x = sc.parallelize((1,1)).count()
print("count: ", x)

spark-submit output, with a similar test Python file
2018-06-21 10:31:47 WARN  NativeCodeLoader:62 - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2018-06-21 10:31:47 INFO  SparkContext:54 - Running Spark version 2.3.1
2018-06-21 10:31:47 INFO  SparkContext:54 - Submitted application: test_spark.py
2018-06-21 10:31:47 INFO  SecurityManager:54 - Changing view acls to: jonedoe
2018-06-21 10:31:47 INFO  SecurityManager:54 - Changing modify acls to: jonedoe
2018-06-21 10:31:47 INFO  SecurityManager:54 - Changing view acls groups to: 
2018-06-21 10:31:47 INFO  SecurityManager:54 - Changing modify acls groups to: 
2018-06-21 10:31:47 INFO  SecurityManager:54 - SecurityManager: authentication disabled; ui acls disabled; users  with view permissions: Set(jonedoe); groups with view permissions: Set(); users  with modify permissions: Set(jonedoe); groups with modify permissions: Set()
2018-06-21 10:31:47 INFO  Utils:54 - Successfully started service 'sparkDriver' on port 61556.
2018-06-21 10:31:47 INFO  SparkEnv:54 - Registering MapOutputTracker
2018-06-21 10:31:47 INFO  SparkEnv:54 - Registering BlockManagerMaster
2018-06-21 10:31:47 INFO  BlockManagerMasterEndpoint:54 - Using org.apache.spark.storage.DefaultTopologyMapper for getting topology information
2018-06-21 10:31:47 INFO  BlockManagerMasterEndpoint:54 - BlockManagerMasterEndpoint up
2018-06-21 10:31:47 INFO  DiskBlockManager:54 - Created local directory at /private/var/folders/gq/tm5q47gn6x363h5m_c86my_00000gp/T/blockmgr-5c0bfcf2-9009-46b5-bcd7-4fa5ec605a89
2018-06-21 10:31:47 INFO  MemoryStore:54 - MemoryStore started with capacity 366.3 MB
2018-06-21 10:31:47 INFO  SparkEnv:54 - Registering OutputCommitCoordinator
2018-06-21 10:31:48 INFO  log:192 - Logging initialized @2297ms
2018-06-21 10:31:48 INFO  Server:346 - jetty-9.3.z-SNAPSHOT
2018-06-21 10:31:48 INFO  Server:414 - Started @2378ms
2018-06-21 10:31:48 INFO  AbstractConnector:278 - Started ServerConnector@84802a{HTTP/1.1,[http/1.1]}{0.0.0.0:4040}
2018-06-21 10:31:48 INFO  Utils:54 - Successfully started service 'SparkUI' on port 4040.
2018-06-21 10:31:48 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@79c67e6f{/jobs,null,AVAILABLE,@Spark}
2018-06-21 10:31:48 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@6889c329{/jobs/json,null,AVAILABLE,@Spark}
2018-06-21 10:31:48 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@3a8c9a58{/jobs/job,null,AVAILABLE,@Spark}
2018-06-21 10:31:48 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@6e04f8ff{/jobs/job/json,null,AVAILABLE,@Spark}
2018-06-21 10:31:48 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@4832ee9d{/stages,null,AVAILABLE,@Spark}
2018-06-21 10:31:48 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@1632f399{/stages/json,null,AVAILABLE,@Spark}
2018-06-21 10:31:48 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@398a3a30{/stages/stage,null,AVAILABLE,@Spark}
2018-06-21 10:31:48 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@2eb62024{/stages/stage/json,null,AVAILABLE,@Spark}
2018-06-21 10:31:48 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@4685c478{/stages/pool,null,AVAILABLE,@Spark}
2018-06-21 10:31:48 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@31053558{/stages/pool/json,null,AVAILABLE,@Spark}
2018-06-21 10:31:48 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@537d3185{/storage,null,AVAILABLE,@Spark}
2018-06-21 10:31:48 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@4c559cce{/storage/json,null,AVAILABLE,@Spark}
2018-06-21 10:31:48 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@249b3738{/storage/rdd,null,AVAILABLE,@Spark}
2018-06-21 10:31:48 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@3c2c6906{/storage/rdd/json,null,AVAILABLE,@Spark}
2018-06-21 10:31:48 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@6e7861f{/environment,null,AVAILABLE,@Spark}
2018-06-21 10:31:48 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@66b4d9e1{/environment/json,null,AVAILABLE,@Spark}
2018-06-21 10:31:48 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@1b6b10f8{/executors,null,AVAILABLE,@Spark}
2018-06-21 10:31:48 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@44502eca{/executors/json,null,AVAILABLE,@Spark}
2018-06-21 10:31:48 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@7ebd8f21{/executors/threadDump,null,AVAILABLE,@Spark}
2018-06-21 10:31:48 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@3e862ac6{/executors/threadDump/json,null,AVAILABLE,@Spark}
2018-06-21 10:31:48 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@7d29113e{/static,null,AVAILABLE,@Spark}
2018-06-21 10:31:48 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@388c37ce{/,null,AVAILABLE,@Spark}
2018-06-21 10:31:48 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@22374681{/api,null,AVAILABLE,@Spark}
2018-06-21 10:31:48 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@dcbeb70{/jobs/job/kill,null,AVAILABLE,@Spark}
2018-06-21 10:31:48 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@322ceede{/stages/stage/kill,null,AVAILABLE,@Spark}
2018-06-21 10:31:48 INFO  SparkUI:54 - Bound SparkUI to 0.0.0.0, and started at http://ip-192-168-65-180.ec2.internal:4040
2018-06-21 10:31:48 INFO  SparkContext:54 - Added file file:/Users/jonedoe/code/test_spark.py at file:/Users/jonedoe/code/test_spark.py with timestamp 1529602308500
2018-06-21 10:31:48 INFO  Utils:54 - Copying /Users/jonedoe/code/test_spark.py to /private/var/folders/gq/tm5q47gn6x363h5m_c86my_00000gp/T/spark-99983724-420e-4bc0-ad1f-3bc41bba9114/userFiles-999bdcde-1e5d-4e9a-98ce-c6ecdaee0739/test_spark.py
2018-06-21 10:31:48 INFO  Executor:54 - Starting executor ID driver on host localhost
2018-06-21 10:31:48 INFO  Utils:54 - Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 61557.
2018-06-21 10:31:48 INFO  NettyBlockTransferService:54 - Server created on ip-192-168-65-180.ec2.internal:61557
2018-06-21 10:31:48 INFO  BlockManager:54 - Using org.apache.spark.storage.RandomBlockReplicationPolicy for block replication policy
2018-06-21 10:31:48 INFO  BlockManagerMaster:54 - Registering BlockManager BlockManagerId(driver, ip-192-168-65-180.ec2.internal, 61557, None)
2018-06-21 10:31:48 INFO  BlockManagerMasterEndpoint:54 - Registering block manager ip-192-168-65-180.ec2.internal:61557 with 366.3 MB RAM, BlockManagerId(driver, ip-192-168-65-180.ec2.internal, 61557, None)
2018-06-21 10:31:48 INFO  BlockManagerMaster:54 - Registered BlockManager BlockManagerId(driver, ip-192-168-65-180.ec2.internal, 61557, None)
2018-06-21 10:31:48 INFO  BlockManager:54 - Initialized BlockManager: BlockManagerId(driver, ip-192-168-65-180.ec2.internal, 61557, None)
2018-06-21 10:31:48 INFO  ContextHandler:781 - Started o.s.j.s.ServletContextHandler@2d1fafea{/metrics/json,null,AVAILABLE,@Spark}
2018-06-21 10:31:49 INFO  SparkContext:54 - Starting job: count at /Users/jonedoe/code/test_spark.py:4
2018-06-21 10:31:49 INFO  DAGScheduler:54 - Got job 0 (count at /Users/jonedoe/code/test_spark.py:4) with 8 output partitions
2018-06-21 10:31:49 INFO  DAGScheduler:54 - Final stage: ResultStage 0 (count at /Users/jonedoe/code/test_spark.py:4)
2018-06-21 10:31:49 INFO  DAGScheduler:54 - Parents of final stage: List()
2018-06-21 10:31:49 INFO  DAGScheduler:54 - Missing parents: List()
2018-06-21 10:31:49 INFO  DAGScheduler:54 - Submitting ResultStage 0 (PythonRDD[1] at count at /Users/jonedoe/code/test_spark.py:4), which has no missing parents
2018-06-21 10:31:49 INFO  MemoryStore:54 - Block broadcast_0 stored as values in memory (estimated size 5.0 KB, free 366.3 MB)
2018-06-21 10:31:49 INFO  MemoryStore:54 - Block broadcast_0_piece0 stored as bytes in memory (estimated size 3.4 KB, free 366.3 MB)
2018-06-21 10:31:49 INFO  BlockManagerInfo:54 - Added broadcast_0_piece0 in memory on ip-192-168-65-180.ec2.internal:61557 (size: 3.4 KB, free: 366.3 MB)
2018-06-21 10:31:49 INFO  SparkContext:54 - Created broadcast 0 from broadcast at DAGScheduler.scala:1039
2018-06-21 10:31:49 INFO  DAGScheduler:54 - Submitting 8 missing tasks from ResultStage 0 (PythonRDD[1] at count at /Users/jonedoe/code/test_spark.py:4) (first 15 tasks are for partitions Vector(0, 1, 2, 3, 4, 5, 6, 7))
2018-06-21 10:31:49 INFO  TaskSchedulerImpl:54 - Adding task set 0.0 with 8 tasks
2018-06-21 10:31:49 INFO  TaskSetManager:54 - Starting task 0.0 in stage 0.0 (TID 0, localhost, executor driver, partition 0, PROCESS_LOCAL, 7839 bytes)
2018-06-21 10:31:49 INFO  TaskSetManager:54 - Starting task 1.0 in stage 0.0 (TID 1, localhost, executor driver, partition 1, PROCESS_LOCAL, 7839 bytes)
2018-06-21 10:31:49 INFO  TaskSetManager:54 - Starting task 2.0 in stage 0.0 (TID 2, localhost, executor driver, partition 2, PROCESS_LOCAL, 7839 bytes)
2018-06-21 10:31:49 INFO  TaskSetManager:54 - Starting task 3.0 in stage 0.0 (TID 3, localhost, executor driver, partition 3, PROCESS_LOCAL, 7858 bytes)
2018-06-21 10:31:49 INFO  TaskSetManager:54 - Starting task 4.0 in stage 0.0 (TID 4, localhost, executor driver, partition 4, PROCESS_LOCAL, 7839 bytes)
2018-06-21 10:31:49 INFO  TaskSetManager:54 - Starting task 5.0 in stage 0.0 (TID 5, localhost, executor driver, partition 5, PROCESS_LOCAL, 7839 bytes)
2018-06-21 10:31:49 INFO  TaskSetManager:54 - Starting task 6.0 in stage 0.0 (TID 6, localhost, executor driver, partition 6, PROCESS_LOCAL, 7839 bytes)
2018-06-21 10:31:49 INFO  TaskSetManager:54 - Starting task 7.0 in stage 0.0 (TID 7, localhost, executor driver, partition 7, PROCESS_LOCAL, 7858 bytes)
2018-06-21 10:31:49 INFO  Executor:54 - Running task 3.0 in stage 0.0 (TID 3)
2018-06-21 10:31:49 INFO  Executor:54 - Running task 2.0 in stage 0.0 (TID 2)
2018-06-21 10:31:49 INFO  Executor:54 - Running task 4.0 in stage 0.0 (TID 4)
2018-06-21 10:31:49 INFO  Executor:54 - Running task 1.0 in stage 0.0 (TID 1)
2018-06-21 10:31:49 INFO  Executor:54 - Running task 6.0 in stage 0.0 (TID 6)
2018-06-21 10:31:49 INFO  Executor:54 - Running task 7.0 in stage 0.0 (TID 7)
2018-06-21 10:31:49 INFO  Executor:54 - Running task 0.0 in stage 0.0 (TID 0)
2018-06-21 10:31:49 INFO  Executor:54 - Running task 5.0 in stage 0.0 (TID 5)
2018-06-21 10:31:49 INFO  Executor:54 - Fetching file:/Users/jonedoe/code/test_spark.py with timestamp 1529602308500
2018-06-21 10:31:49 INFO  Utils:54 - /Users/jonedoe/code/test_spark.py has been previously copied to /private/var/folders/gq/tm5q47gn6x363h5m_c86my_00000gp/T/spark-99983724-420e-4bc0-ad1f-3bc41bba9114/userFiles-999bdcde-1e5d-4e9a-98ce-c6ecdaee0739/test_spark.py
2018-06-21 10:31:49 INFO  PythonRunner:54 - Times: total = 397, boot = 389, init = 8, finish = 0
2018-06-21 10:31:49 INFO  PythonRunner:54 - Times: total = 399, boot = 396, init = 3, finish = 0
2018-06-21 10:31:49 INFO  PythonRunner:54 - Times: total = 406, boot = 403, init = 3, finish = 0
2018-06-21 10:31:49 INFO  PythonRunner:54 - Times: total = 413, boot = 410, init = 3, finish = 0
2018-06-21 10:31:49 INFO  PythonRunner:54 - Times: total = 420, boot = 417, init = 3, finish = 0
2018-06-21 10:31:49 INFO  PythonRunner:54 - Times: total = 426, boot = 423, init = 2, finish = 1
2018-06-21 10:31:49 INFO  PythonRunner:54 - Times: total = 433, boot = 430, init = 3, finish = 0
2018-06-21 10:31:49 INFO  PythonRunner:54 - Times: total = 441, boot = 437, init = 3, finish = 1
2018-06-21 10:31:49 INFO  Executor:54 - Finished task 5.0 in stage 0.0 (TID 5). 1267 bytes result sent to driver
2018-06-21 10:31:49 INFO  Executor:54 - Finished task 2.0 in stage 0.0 (TID 2). 1267 bytes result sent to driver
2018-06-21 10:31:49 INFO  Executor:54 - Finished task 3.0 in stage 0.0 (TID 3). 1267 bytes result sent to driver
2018-06-21 10:31:49 INFO  Executor:54 - Finished task 6.0 in stage 0.0 (TID 6). 1267 bytes result sent to driver
2018-06-21 10:31:49 INFO  Executor:54 - Finished task 7.0 in stage 0.0 (TID 7). 1267 bytes result sent to driver
2018-06-21 10:31:49 INFO  Executor:54 - Finished task 4.0 in stage 0.0 (TID 4). 1267 bytes result sent to driver
2018-06-21 10:31:49 INFO  Executor:54 - Finished task 1.0 in stage 0.0 (TID 1). 1310 bytes result sent to driver
2018-06-21 10:31:49 INFO  Executor:54 - Finished task 0.0 in stage 0.0 (TID 0). 1310 bytes result sent to driver
2018-06-21 10:31:49 INFO  TaskSetManager:54 - Finished task 5.0 in stage 0.0 (TID 5) in 580 ms on localhost (executor driver) (1/8)
2018-06-21 10:31:49 INFO  TaskSetManager:54 - Finished task 3.0 in stage 0.0 (TID 3) in 586 ms on localhost (executor driver) (2/8)
2018-06-21 10:31:49 INFO  TaskSetManager:54 - Finished task 2.0 in stage 0.0 (TID 2) in 587 ms on localhost (executor driver) (3/8)
2018-06-21 10:31:49 INFO  TaskSetManager:54 - Finished task 6.0 in stage 0.0 (TID 6) in 583 ms on localhost (executor driver) (4/8)
2018-06-21 10:31:50 INFO  TaskSetManager:54 - Finished task 4.0 in stage 0.0 (TID 4) in 586 ms on localhost (executor driver) (5/8)
2018-06-21 10:31:50 INFO  TaskSetManager:54 - Finished task 7.0 in stage 0.0 (TID 7) in 584 ms on localhost (executor driver) (6/8)
2018-06-21 10:31:50 INFO  TaskSetManager:54 - Finished task 0.0 in stage 0.0 (TID 0) in 608 ms on localhost (executor driver) (7/8)
2018-06-21 10:31:50 INFO  TaskSetManager:54 - Finished task 1.0 in stage 0.0 (TID 1) in 590 ms on localhost (executor driver) (8/8)
2018-06-21 10:31:50 INFO  TaskSchedulerImpl:54 - Removed TaskSet 0.0, whose tasks have all completed, from pool 
2018-06-21 10:31:50 INFO  DAGScheduler:54 - ResultStage 0 (count at /Users/jonedoe/code/test_spark.py:4) finished in 0.774 s
2018-06-21 10:31:50 INFO  DAGScheduler:54 - Job 0 finished: count at /Users/jonedoe/code/test_spark.py:4, took 0.825530 s

After this point, it hangs...

Tags: apache-spark, pyspark

Solution


It looks like my antivirus software (Bitdefender) was the culprit.

For some reason, it was blocking Spark.
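
If you suspect something similar on your machine, one quick check is whether plain TCP traffic over the loopback interface works at all, since the PySpark Python driver exchanges data with the JVM over local sockets. The sketch below is only a diagnostic idea, not something from the original answer: it assumes the interference would show up as a blocked or stalled connection on 127.0.0.1, which the answer does not confirm, and the helper name `loopback_roundtrip` is made up for illustration.

```python
import socket
import threading


def loopback_roundtrip(payload=b"ping", timeout=5.0):
    """Return True if `payload` is echoed back over a 127.0.0.1 TCP connection.

    Hypothetical helper: a failure or timeout here hints that something
    (for example security software) is interfering with local sockets.
    """
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
    server.listen(1)
    server.settimeout(timeout)
    port = server.getsockname()[1]

    def echo_once():
        # Accept a single connection and send back whatever was received.
        try:
            conn, _ = server.accept()
            with conn:
                conn.settimeout(timeout)
                conn.sendall(conn.recv(len(payload)))
        except OSError:
            pass  # timeout or blocked connection; the client reports failure

    worker = threading.Thread(target=echo_once, daemon=True)
    worker.start()
    try:
        with socket.create_connection(("127.0.0.1", port), timeout=timeout) as client:
            client.sendall(payload)
            return client.recv(len(payload)) == payload
    except OSError:
        return False
    finally:
        worker.join(timeout)
        server.close()


if __name__ == "__main__":
    ok = loopback_roundtrip()
    print("loopback OK" if ok else "loopback blocked or timed out")
```

A passing check does not rule out the antivirus, because it may target specific processes or ports. Temporarily disabling the antivirus and rerunning `sc.parallelize((1,1)).count()` remains the more direct test.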

