How to install pyspark without hadoop?

Problem description

I want to install pyspark, but I don't want to use Hadoop because I only want to test some features. I followed the instructions from a bunch of sites: I installed pyspark with pip, installed JDK 8, and set the JAVA_PATH, SPARK_HOME, and PATH variables, but it doesn't work.

My program is:

from pyspark import *
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

I get this exception:

\Java\jdk1.8.0_291\bin\java was unexpected at this time.
Traceback (most recent call last):
  File "c:\Users\ankit\Untitled-1.py", line 4, in <module>
    spark = SparkSession.builder.getOrCreate()
  File "C:\Users\ankit\AppData\Local\Programs\Python\Python39\lib\site-packages\pyspark\sql\session.py", line 228, in getOrCreate
    sc = SparkContext.getOrCreate(sparkConf)
  File "C:\Users\ankit\AppData\Local\Programs\Python\Python39\lib\site-packages\pyspark\context.py", line 384, in getOrCreate
    SparkContext(conf=conf or SparkConf())
  File "C:\Users\ankit\AppData\Local\Programs\Python\Python39\lib\site-packages\pyspark\context.py", line 144, in __init__
    SparkContext._ensure_initialized(self, gateway=gateway, conf=conf)
  File "C:\Users\ankit\AppData\Local\Programs\Python\Python39\lib\site-packages\pyspark\context.py", line 331, in _ensure_initialized
    SparkContext._gateway = gateway or launch_gateway(conf)
  File "C:\Users\ankit\AppData\Local\Programs\Python\Python39\lib\site-packages\pyspark\java_gateway.py", line 108, in launch_gateway
    raise Exception("Java gateway process exited before sending its port number")
Exception: Java gateway process exited before sending its port number

Tags: hadoop, pyspark

Solution
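The message `\Java\jdk1.8.0_291\bin\java was unexpected at this time.` comes from Spark's Windows launch script, and it usually means the value of `JAVA_HOME` breaks cmd's parsing: the variable was set with surrounding quotes (so `"%JAVA_HOME%\bin\java"` expands to a doubly-quoted path and cmd chokes on the leftover `\Java\...\bin\java`), or the path contains spaces or parentheses such as `C:\Program Files (x86)\...`. Also note that Spark reads `JAVA_HOME`, not `JAVA_PATH`. No Hadoop installation is needed: a pip-installed pyspark bundles Spark itself and can run with a `local[*]` master. As a quick sanity check, here is a small illustrative helper (not part of pyspark) that flags the usual `JAVA_HOME` mistakes:

```python
import os

def check_java_home(value):
    """Flag common JAVA_HOME mistakes that break Spark's Windows
    launch scripts (illustrative helper, not part of pyspark)."""
    problems = []
    if value is None:
        problems.append("JAVA_HOME is not set (Spark does not read JAVA_PATH)")
        return problems
    if value.startswith('"') or value.endswith('"'):
        # A quoted value makes cmd expand "%JAVA_HOME%\bin\java" into
        # ""C:\...\jdk"\bin\java", producing exactly the
        # '...\bin\java was unexpected at this time.' error.
        problems.append("remove the surrounding quotes from JAVA_HOME")
    stripped = value.strip('"')
    if any(ch in stripped for ch in ' ()'):
        problems.append("path contains spaces or parentheses; "
                        r"install the JDK under e.g. C:\Java instead")
    java = "java.exe" if os.name == "nt" else "java"
    if not os.path.isfile(os.path.join(stripped, "bin", java)):
        problems.append("no bin\\java under JAVA_HOME "
                        "(point it at the JDK root, not at bin)")
    return problems
```

With a clean `JAVA_HOME` (e.g. `C:\Java\jdk1.8.0_291`, unquoted, JDK root rather than `bin`), `SparkSession.builder.master("local[*]").getOrCreate()` should start without Hadoop; on Windows, Spark may still log a harmless warning about a missing `winutils.exe`, which can be ignored for local testing.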

