Exception: Java gateway process exited before sending its port number when deploying Spark in AWS Lambda

Problem Description

I have been trying to host a Spark NLP model in an AWS Lambda container using AWS SAM. When I test the container locally with sam local start-api, it works perfectly fine. But when it is deployed to the AWS Lambda function, I get the following error.
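For reference, a typical SAM CLI session for this kind of local test looks like the following (the build step is an assumption about the workflow; only the start-api call appears in the description above):

sam build
sam local start-api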

Error

[ERROR] Exception: Java gateway process exited before sending its port number
Traceback (most recent call last):
  File "/var/lang/lib/python3.8/imp.py", line 234, in load_module
    return load_source(name, filename, file)
  File "/var/lang/lib/python3.8/imp.py", line 171, in load_source
    module = _load(spec)
  File "<frozen importlib._bootstrap>", line 702, in _load
  File "<frozen importlib._bootstrap>", line 671, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 848, in exec_module
  File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
  File "/var/task/app.py", line 30, in <module>
    spark = start()
  File "/var/task/app.py", line 28, in start
    return builder.getOrCreate()
  File "/var/lang/lib/python3.8/site-packages/pyspark/sql/session.py", line 228, in getOrCreate
    sc = SparkContext.getOrCreate(sparkConf)
  File "/var/lang/lib/python3.8/site-packages/pyspark/context.py", line 384, in getOrCreate
    SparkContext(conf=conf or SparkConf())
  File "/var/lang/lib/python3.8/site-packages/pyspark/context.py", line 144, in __init__
    SparkContext._ensure_initialized(self, gateway=gateway, conf=conf)
  File "/var/lang/lib/python3.8/site-packages/pyspark/context.py", line 331, in _ensure_initialized
    SparkContext._gateway = gateway or launch_gateway(conf)
  File "/var/lang/lib/python3.8/site-packages/pyspark/java_gateway.py", line 108, in launch_gateway
    raise Exception("Java gateway process exited before sending its port number")
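
The bottom frames of the traceback show that app.py builds a SparkSession at module import time (lines 28 and 30 of app.py), so the Java gateway is launched during Lambda's init phase, before any request is handled. A minimal reconstruction of that code for context (the app name and any extra builder configuration are assumptions; only the start()/getOrCreate() structure comes from the traceback):

# app.py -- reconstructed skeleton; configuration values are assumptions
from pyspark.sql import SparkSession

def start():
    builder = SparkSession.builder \
        .appName("spark-nlp-lambda") \
        .master("local[3]")           # matches PYSPARK_SUBMIT_ARGS in the Dockerfile below
    return builder.getOrCreate()      # line 28 in the traceback

spark = start()                       # line 30: runs at import, during Lambda init

def lambda_handler(event, context):
    # handler body omitted in the question
    return {"statusCode": 200}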

Dockerfile:

FROM public.ecr.aws/lambda/python:3.8

ENV PYSPARK_PYTHON=python3.8
ENV PYSPARK_DRIVER_PYTHON=python3.8
COPY . ./


RUN yum install -y \
       java-1.8.0-openjdk \
       java-1.8.0-openjdk-devel


ENV JAVA_HOME /etc/alternatives/jre
RUN echo "export JAVA_HOME=/etc/alternatives/jre" >> ~/.bashrc
RUN pip install --no-cache-dir  pyspark==3.1.1 spark-nlp==3.0.3 pandas requests
RUN chmod 644 /var/lang/lib/python3.8/site-packages/pyspark/bin/
RUN chmod 755 /var/lang/lib/python3.8/site-packages/pyspark/bin/

ENV PYSPARK_SUBMIT_ARGS="--master local[3] pyspark-shell"

RUN echo "export PYSPARK_SUBMIT_ARGS=--master local[3] pyspark-shell" >> ~/.bashrc

CMD ["app.lambda_handler"]
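
One way to see the underlying Java error, rather than just the fact that the gateway exited, is to open a shell in the built image and run the same launcher that PySpark's launch_gateway invokes; a sketch, with the image tag as a placeholder:

docker run --rm -it --entrypoint /bin/bash <your-image-tag>
echo $JAVA_HOME && $JAVA_HOME/bin/java -version
/var/lang/lib/python3.8/site-packages/pyspark/bin/spark-submit --version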

Can anyone help me resolve this issue?

Tags: python, amazon-web-services, pyspark, aws-lambda, aws-sam

Solution
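
A commonly reported cause of "Java gateway process exited before sending its port number" in Lambda container images is that the PySpark launcher scripts never become executable: a chmod 644 or 755 on the pyspark/bin directory changes the directory's own mode, not the scripts inside it, so launch_gateway cannot start bin/spark-submit. Two further Lambda-specific pitfalls: the runtime filesystem is read-only except for /tmp, and exports appended to ~/.bashrc never take effect because the handler is not started from a login shell, so only ENV instructions matter. A sketch of a Dockerfile adjusted along those lines (the recursive chmod and the /tmp scratch directory are assumptions based on how Lambda and PySpark behave, not a confirmed fix for this exact setup):

FROM public.ecr.aws/lambda/python:3.8

RUN yum install -y \
       java-1.8.0-openjdk \
       java-1.8.0-openjdk-devel
ENV JAVA_HOME /etc/alternatives/jre

RUN pip install --no-cache-dir pyspark==3.1.1 spark-nlp==3.0.3 pandas requests

# Make the launcher scripts themselves executable; a chmod on the
# directory alone does not touch the files inside it.
RUN chmod -R 755 /var/lang/lib/python3.8/site-packages/pyspark/bin

# Lambda only allows writes under /tmp, so point Spark's scratch
# space there instead of the default read-only locations.
ENV SPARK_LOCAL_DIRS /tmp
ENV PYSPARK_PYTHON python3.8
ENV PYSPARK_DRIVER_PYTHON python3.8
ENV PYSPARK_SUBMIT_ARGS "--master local[3] pyspark-shell"

COPY . ./
CMD ["app.lambda_handler"]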

