Adding packages in pyspark with findspark

Problem description

I am using the findspark package to add a package in my notebook. Is there a reason I am getting this error? (I am using Spark 2.3.0.)

import findspark
findspark.add_packages(["org.apache.spark:spark-streaming-kafka-0-8_2.11:2.1.0"])
KeyErrorTraceback (most recent call last)
<ipython-input-2-94ec2e600525> in <module>()
      2 import sys
      3 import findspark
----> 4 findspark.add_packages(["org.apache.spark:spark-streaming-kafka-0-8_2.11:2.1.0"])
      5 #findspark.add_packages(["Azure:mmlspark:0.13"])

/home/narjunan/anaconda/envs/sparkpy27/lib/python2.7/site-packages/findspark.pyc in add_packages(packages)
    155         packages = [packages]
    156 
--> 157     os.environ["PYSPARK_SUBMIT_ARGS"] += " --packages "+ ",".join(packages)  +" pyspark-shell"
    158 
    159 def add_jars(jars):

/home/narjunan/anaconda/envs/sparkpy27/lib/python2.7/UserDict.pyc in __getitem__(self, key)
     38         if hasattr(self.__class__, "__missing__"):
     39             return self.__class__.__missing__(self, key)
---> 40         raise KeyError(key)
     41     def __setitem__(self, key, item): self.data[key] = item
     42     def __delitem__(self, key): del self.data[key]

KeyError: 'PYSPARK_SUBMIT_ARGS'

Tags: apache-spark, pyspark

Solution
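
The traceback itself points at the cause: `findspark.add_packages` appends to the environment variable with `os.environ["PYSPARK_SUBMIT_ARGS"] += ...`, and `+=` raises `KeyError` when `PYSPARK_SUBMIT_ARGS` has never been set. A minimal workaround sketch, assuming that initializing the variable to the usual `pyspark-shell` default before calling `add_packages` is acceptable in your setup:

```python
import os

# findspark's add_packages does:
#   os.environ["PYSPARK_SUBMIT_ARGS"] += " --packages ... pyspark-shell"
# which fails with KeyError if the variable does not exist yet.
# Initialize it first so the append has something to extend:
os.environ.setdefault("PYSPARK_SUBMIT_ARGS", "pyspark-shell")

# The append findspark performs (reproduced here for illustration)
# now succeeds instead of raising KeyError:
os.environ["PYSPARK_SUBMIT_ARGS"] += (
    " --packages org.apache.spark:spark-streaming-kafka-0-8_2.11:2.1.0"
    " pyspark-shell"
)
```

With the variable initialized, `import findspark; findspark.add_packages([...])` should no longer hit the `KeyError`. Note that `setdefault` leaves the variable untouched if it was already set, so this is safe to run in environments that export `PYSPARK_SUBMIT_ARGS` themselves.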
