Java exception when running pyspark on a MacBook

Problem description

I have just started learning Spark and ran into a problem launching the interactive shell. When I run `pyspark` on the command line, I get the following output:

Python 3.8.2 (default, Dec 21 2020, 15:06:04)
[Clang 12.0.0 (clang-1200.0.32.29)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
Exception in thread "main" java.lang.ExceptionInInitializerError
    at org.apache.spark.unsafe.array.ByteArrayMethods.<clinit>(ByteArrayMethods.java:54)
    at org.apache.spark.internal.config.package$.<init>(package.scala:1095)
    at org.apache.spark.internal.config.package$.<clinit>(package.scala)
    at org.apache.spark.deploy.SparkSubmitArguments.$anonfun$loadEnvironmentArguments$3(SparkSubmitArguments.scala:157)
    at scala.Option.orElse(Option.scala:447)
    at org.apache.spark.deploy.SparkSubmitArguments.loadEnvironmentArguments(SparkSubmitArguments.scala:157)
    at org.apache.spark.deploy.SparkSubmitArguments.<init>(SparkSubmitArguments.scala:115)
    at org.apache.spark.deploy.SparkSubmit$$anon$2$$anon$3.<init>(SparkSubmit.scala:1022)
    at org.apache.spark.deploy.SparkSubmit$$anon$2.parseArguments(SparkSubmit.scala:1022)
    at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:85)
    at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:1039)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:1048)
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: java.lang.reflect.InaccessibleObjectException: Unable to make private java.nio.DirectByteBuffer(long,int) accessible: module java.base does not "opens java.nio" to unnamed module @13545af8
    at java.base/java.lang.reflect.AccessibleObject.checkCanSetAccessible(AccessibleObject.java:357)
    at java.base/java.lang.reflect.AccessibleObject.checkCanSetAccessible(AccessibleObject.java:297)
    at java.base/java.lang.reflect.Constructor.checkCanSetAccessible(Constructor.java:188)
    at java.base/java.lang.reflect.Constructor.setAccessible(Constructor.java:181)
    at org.apache.spark.unsafe.Platform.<clinit>(Platform.java:56)
    ... 13 more
Traceback (most recent call last):
  File "/Users/yxiong/sys/python-venv/lib/python3.8/site-packages/pyspark/python/pyspark/shell.py", line 35, in <module>
    SparkContext._ensure_initialized()  # type: ignore
  File "/Users/yxiong/sys/python-venv/lib/python3.8/site-packages/pyspark/context.py", line 331, in _ensure_initialized
    SparkContext._gateway = gateway or launch_gateway(conf)
  File "/Users/yxiong/sys/python-venv/lib/python3.8/site-packages/pyspark/java_gateway.py", line 108, in launch_gateway
    raise Exception("Java gateway process exited before sending its port number")
Exception: Java gateway process exited before sending its port number
>>>

More details about my environment:

  1. I am on macOS Catalina 10.15.7.

  2. My Python version is 3.8.2, and I use virtualenv to manage my Python packages.

  3. I installed pyspark into my virtualenv with `pip install pyspark`, and the installation completed without errors. I also tried downloading the pre-built spark-3.1.2-bin-hadoop3.2 tarball from the Apache site and running `bin/pyspark` from there, but got the same error as above.

  4. I installed jre-8u301 and jdk-16.0.2 from the Oracle website, and `java -version` shows the following. I added `export JAVA_HOME=$(/usr/libexec/java_home)` (which evaluates to `/Library/Java/JavaVirtualMachines/jdk-16.0.2.jdk/Contents/Home`) to my `~/.bash_profile`, but it did not solve the problem above.

    $ java -version
    java version "16.0.2" 2021-07-20
    Java(TM) SE Runtime Environment (build 16.0.2+7-67)
    Java HotSpot(TM) 64-Bit Server VM (build 16.0.2+7-67, mixed mode, sharing)
    

How can I fix or work around this problem?
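One thing worth noting about the trace above: the final Python exception ("Java gateway process exited before sending its port number") is a generic symptom, not the root cause. pyspark raises it whenever the spark-submit JVM dies for any reason before reporting a port back to Python; the real failure is the Java-side `InaccessibleObjectException` higher up. A rough sketch of that handshake (a hypothetical simplification, not pyspark's actual implementation; the env var name is illustrative):

```python
import os
import subprocess
import sys
import tempfile
import time

def launch_gateway_sketch(cmd):
    """Simplified sketch of pyspark's gateway launch: start the JVM
    subprocess and wait for it to write its listening port to a file."""
    with tempfile.TemporaryDirectory() as d:
        port_file = os.path.join(d, "gateway_port")
        # Tell the child where to report its port (illustrative variable name)
        env = dict(os.environ, _CONN_INFO_PATH=port_file)
        proc = subprocess.Popen(cmd, env=env)
        # Wait until the child either reports a port or exits
        while proc.poll() is None and not os.path.isfile(port_file):
            time.sleep(0.1)
        if not os.path.isfile(port_file):
            # This is the branch the traceback above ends in: the JVM died
            # (here, from the Java 16 InaccessibleObjectException) before
            # reporting a port, so Python only sees this generic message.
            raise Exception(
                "Java gateway process exited before sending its port number")
        with open(port_file) as f:
            return int(f.read())
```

So the fix has to address the JVM crash itself; retrying on the Python side will always hit the same error.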

Tags: java, apache-spark, pyspark

Solution
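The `InaccessibleObjectException` ("module java.base does not \"opens java.nio\" to unnamed module") indicates the shell is running on Java 16, whose module system blocks the reflective access Spark's `Platform` class needs. Spark 3.1.2 documents support for Java 8 and Java 11 only, so the usual workaround is to point `JAVA_HOME` at one of those instead of the 16.0.2 JDK. A minimal sketch, assuming a Java 8 or 11 JDK is also installed (the `-v 11` selector below is an assumption; use `-v 1.8` for Java 8):

```shell
# List every installed JDK and its home directory (macOS)
/usr/libexec/java_home -V

# Pin JAVA_HOME to a supported major version for this shell session,
# instead of letting java_home pick the newest JDK (16.0.2)
export JAVA_HOME=$(/usr/libexec/java_home -v 11)

# Relaunch the shell with the supported JVM
pyspark
```

To make this permanent, the `export` line can replace the unversioned `export JAVA_HOME=$(/usr/libexec/java_home)` in `~/.bash_profile`. If jre-8u301 is the only Java 8 install, note that a JRE alone may not be listed by `java_home`; installing a full Java 8 or 11 JDK is the more reliable route.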
