pyspark - No module error in Pyspark (Jupyter) from worker.py
Problem Description
I need the adblockparser module in my Python script. When I run it on PySpark configured with Jupyter, it returns the following log:
PythonException:
An exception was thrown from the Python worker. Please see the stack trace below.
Traceback (most recent call last):
File "/home/student/spark/spark-3.0.2-bin-hadoop2.7/python/lib/pyspark.zip/pyspark/worker.py", line 589, in main
func, profiler, deserializer, serializer = read_udfs(pickleSer, infile, eval_type)
File "/home/student/spark/spark-3.0.2-bin-hadoop2.7/python/lib/pyspark.zip/pyspark/worker.py", line 447, in read_udfs
udfs.append(read_single_udf(pickleSer, infile, eval_type, runner_conf, udf_index=i))
File "/home/student/spark/spark-3.0.2-bin-hadoop2.7/python/lib/pyspark.zip/pyspark/worker.py", line 254, in read_single_udf
f, return_type = read_command(pickleSer, infile)
File "/home/student/spark/spark-3.0.2-bin-hadoop2.7/python/lib/pyspark.zip/pyspark/worker.py", line 76, in read_command
command = serializer.loads(command.value)
File "/home/student/spark/spark-3.0.2-bin-hadoop2.7/python/lib/pyspark.zip/pyspark/serializers.py", line 458, in loads
return pickle.loads(obj, encoding=encoding)
ModuleNotFoundError: No module named 'adblockparser'
even though I have already installed this module from the Jupyter notebook with !pip install adblockparser.
My .bashrc file looks like this:
export SPARK_HOME=/home/student/spark/spark-3.0.2-bin-hadoop2.7
export PATH=$PATH:$SPARK_HOME/bin
export PYSPARK_DRIVER_PYTHON="jupyter"
export PYSPARK_DRIVER_PYTHON_OPTS="notebook"
export PYSPARK_PYTHON=python3
Solution
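The traceback comes from the Spark worker process, not from the notebook kernel. The workers run the interpreter named by PYSPARK_PYTHON (here python3), while !pip install inside Jupyter installs the package into whatever Python the notebook kernel itself uses, which may be a different interpreter or environment. A minimal sketch of two ways to line these up, assuming local mode so the driver and the workers share one machine:

# 1) See which interpreter the notebook (driver) is actually running
import sys
print(sys.executable)

# 2) Install adblockparser for the interpreter the workers use
#    (PYSPARK_PYTHON=python3) -- run in a terminal, not in the notebook
#    python3 -m pip install adblockparser

# 3) Or point the workers at the notebook's interpreter instead,
#    before the SparkContext / SparkSession is created
import os
os.environ["PYSPARK_PYTHON"] = sys.executable

On a real cluster the package must exist on every worker node; one option is installing it with pip on each node, another is shipping a zipped copy of the (pure-Python) module with the job, for example via spark-submit --py-files or SparkContext.addPyFile.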