No module error in PySpark (Jupyter) worker.py

Problem description

I need the adblockparser module in my Python script. When I run it on PySpark configured with Jupyter, it returns the following log:

PythonException: 
  An exception was thrown from the Python worker. Please see the stack trace below.
Traceback (most recent call last):
  File "/home/student/spark/spark-3.0.2-bin-hadoop2.7/python/lib/pyspark.zip/pyspark/worker.py", line 589, in main
    func, profiler, deserializer, serializer = read_udfs(pickleSer, infile, eval_type)
  File "/home/student/spark/spark-3.0.2-bin-hadoop2.7/python/lib/pyspark.zip/pyspark/worker.py", line 447, in read_udfs
    udfs.append(read_single_udf(pickleSer, infile, eval_type, runner_conf, udf_index=i))
  File "/home/student/spark/spark-3.0.2-bin-hadoop2.7/python/lib/pyspark.zip/pyspark/worker.py", line 254, in read_single_udf
    f, return_type = read_command(pickleSer, infile)
  File "/home/student/spark/spark-3.0.2-bin-hadoop2.7/python/lib/pyspark.zip/pyspark/worker.py", line 76, in read_command
    command = serializer.loads(command.value)
  File "/home/student/spark/spark-3.0.2-bin-hadoop2.7/python/lib/pyspark.zip/pyspark/serializers.py", line 458, in loads
    return pickle.loads(obj, encoding=encoding)
ModuleNotFoundError: No module named 'adblockparser'

even though I had already installed this module from the Jupyter notebook with !pip install adblockparser

My .bashrc file looks like this:

export SPARK_HOME=/home/student/spark/spark-3.0.2-bin-hadoop2.7
export PATH=$PATH:$SPARK_HOME/bin
export PYSPARK_DRIVER_PYTHON="jupyter"
export PYSPARK_DRIVER_PYTHON_OPTS="notebook"
export PYSPARK_PYTHON=python3
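With this configuration the driver runs inside the Jupyter kernel, while each executor launches whatever python3 resolves to on its PATH. If those are different interpreters, packages installed from the notebook are invisible to the workers. A quick sketch for comparing the two (the subprocess call is an assumption that python3 is on the PATH, as the .bashrc implies):

```python
import subprocess
import sys

# Interpreter running this notebook kernel (the Spark driver):
driver_py = sys.executable

# Interpreter the executors will launch, since PYSPARK_PYTHON=python3
# resolves against PATH:
worker_py = subprocess.check_output(
    ["python3", "-c", "import sys; print(sys.executable)"],
    text=True,
).strip()

print(driver_py)
print(worker_py)
# If the two paths differ, a notebook-side `!pip install` never
# reached the workers' interpreter.
```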

Tags: pyspark, apache-spark-sql, jupyter-notebook

Solution
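One likely cause, given the .bashrc above: !pip install installs into the Jupyter kernel's interpreter (the driver), but the workers run python3, which may be a different interpreter without adblockparser. A minimal sketch of one fix, assuming a single-machine (local or standalone) setup, is to point the executors at the notebook's own interpreter before any SparkSession or SparkContext is created:

```python
import os
import sys

# Make the executors use the same interpreter as the notebook kernel,
# so packages installed with `!pip install` are visible to the workers.
# This must run BEFORE the SparkSession/SparkContext is created.
os.environ["PYSPARK_PYTHON"] = sys.executable
os.environ["PYSPARK_DRIVER_PYTHON"] = sys.executable
```

On a multi-node cluster this alone is not enough: every worker node needs the package in the interpreter it runs, e.g. by running python3 -m pip install adblockparser on each node, or by shipping the dependency with the job.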

