Unable to run PySpark code locally after importing it from Databricks

Problem description

Even after completing all of the required Spark configuration, I am unable to run the PySpark code that I imported from Databricks; I get the error shown below. Please help me resolve this.

Code:

# Databricks notebook source
from pyspark import SparkConf, SparkContext

# COMMAND ----------

conf = SparkConf().setAppName('Read File')  # application name shown in the Spark UI

# COMMAND ----------

sc = SparkContext.getOrCreate(conf=conf)  # reuse an existing context or create a new one

# COMMAND ----------

text = sc.textFile('sample.txt')  # RDD of lines; relative path resolves against the working directory

# COMMAND ----------

print(text.collect())  # pull all lines back to the driver

# COMMAND ----------

sc.stop()
# COMMAND ----------

Error:

C:\Users\Abhishikth\Desktop>spark-submit 'fsc2.py'
21/10/18 08:03:34 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Exception in thread "main" org.apache.spark.SparkException: Failed to get main class in JAR with error 'File file:/C:/Users/Abhishikth/Desktop/'fsc2.py' does not exist'.  Please specify one with --class.
        at org.apache.spark.deploy.SparkSubmit.error(SparkSubmit.scala:968)
        at org.apache.spark.deploy.SparkSubmit.prepareSubmitEnvironment(SparkSubmit.scala:486)
        at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:894)
        at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:180)
        at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:203)
        at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:90)
        at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:1039)
        at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:1048)
        at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)

Tags: python, apache-spark, pyspark, rdd

Solution
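
The stack trace reveals the actual problem: Spark is looking for a file literally named 'fsc2.py', single quotes included (note file:/C:/Users/Abhishikth/Desktop/'fsc2.py' in the message). On Windows, cmd.exe does not treat single quotes as quoting characters, so they are passed to spark-submit as part of the file name; and because the quoted name no longer ends in .py, spark-submit treats the argument as a JAR, which is why the failure is reported as "Failed to get main class in JAR" rather than a plain file-not-found error. Drop the quotes, or use double quotes, which cmd.exe does strip:

C:\Users\Abhishikth\Desktop>spark-submit fsc2.py

The # COMMAND ---------- markers left over from the Databricks export are ordinary Python comments and do not affect execution. One more thing to watch once the script is found: sc.textFile('sample.txt') resolves the relative path against the directory you submit from, so sample.txt must be in that directory. Below is a minimal sketch of the same job with an explicit local master and an absolute input path; the file:///... URI is an assumption that sample.txt sits on the same Desktop as fsc2.py, so adjust it to wherever the file actually lives:

from pyspark import SparkConf, SparkContext

# local[*] runs Spark on this machine using all available cores
conf = SparkConf().setAppName('Read File').setMaster('local[*]')
sc = SparkContext.getOrCreate(conf=conf)

# An absolute file:/// URI does not depend on the working directory
# (assumed location; change if sample.txt is elsewhere)
text = sc.textFile('file:///C:/Users/Abhishikth/Desktop/sample.txt')
print(text.collect())

sc.stop()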

