hadoop - How do I install pyspark without hadoop?
Problem description
I want to install pyspark, but I don't want to use hadoop because I only want to test some features. I followed the instructions from a bunch of websites: I installed pyspark with pip, installed JDK 8, and set the JAVA_PATH, SPARK_HOME and PATH variables, but it doesn't work.
My program is:
from pyspark import *
from pyspark.sql import SparkSession
spark = SparkSession.builder.getOrCreate()
I get this exception:
\Java\jdk1.8.0_291\bin\java was unexpected at this time.
Traceback (most recent call last):
  File "c:\Users\ankit\Untitled-1.py", line 4, in <module>
    spark = SparkSession.builder.getOrCreate()
  File "C:\Users\ankit\AppData\Local\Programs\Python\Python39\lib\site-packages\pyspark\sql\session.py", line 228, in getOrCreate
    sc = SparkContext.getOrCreate(sparkConf)
  File "C:\Users\ankit\AppData\Local\Programs\Python\Python39\lib\site-packages\pyspark\context.py", line 384, in getOrCreate
    SparkContext(conf=conf or SparkConf())
  File "C:\Users\ankit\AppData\Local\Programs\Python\Python39\lib\site-packages\pyspark\context.py", line 144, in __init__
    SparkContext._ensure_initialized(self, gateway=gateway, conf=conf)
  File "C:\Users\ankit\AppData\Local\Programs\Python\Python39\lib\site-packages\pyspark\context.py", line 331, in _ensure_initialized
    SparkContext._gateway = gateway or launch_gateway(conf)
  File "C:\Users\ankit\AppData\Local\Programs\Python\Python39\lib\site-packages\pyspark\java_gateway.py", line 108, in launch_gateway
    raise Exception("Java gateway process exited before sending its port number")
Exception: Java gateway process exited before sending its port number
Solution
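The message `was unexpected at this time.` comes from the Windows `cmd.exe` interpreter, and with PySpark it commonly means the `JAVA_HOME` value is malformed: it contains quote characters, points at the `bin` folder instead of the JDK root, or was set under the wrong name (the question sets `JAVA_PATH`, but PySpark's launch scripts read `JAVA_HOME`). A minimal pre-flight check is sketched below; `check_java_home` is an illustrative helper for this answer, not a PySpark API:

```python
def check_java_home(value):
    """Collect common JAVA_HOME mistakes that make PySpark's Windows
    launch script fail with 'was unexpected at this time.'

    `check_java_home` is a hypothetical helper, not part of PySpark.
    """
    problems = []
    if not value:
        # PySpark's scripts look up JAVA_HOME; a variable named JAVA_PATH is ignored.
        problems.append("JAVA_HOME is not set (PySpark reads JAVA_HOME, not JAVA_PATH)")
        return problems
    if '"' in value:
        # Quotes inside the value break the cmd.exe launch script.
        problems.append("value contains quote characters; set it without quotes")
    if value.rstrip("\\").lower().endswith("bin"):
        problems.append("value should point to the JDK root, not its bin folder")
    return problems

# A quoted value like this is the kind that breaks the launch script:
print(check_java_home('"C:\\Java\\jdk1.8.0_291"'))
# A clean value passes:
print(check_java_home(r"C:\Java\jdk1.8.0_291"))
```

Once `JAVA_HOME` is set to a clean, unquoted JDK root path, the pip-installed pyspark runs in local mode without a Hadoop installation; `SparkSession.builder.getOrCreate()` starts a local Spark on its own (a missing `winutils.exe` only produces a warning in this scenario).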