How do I set up user impersonation for interpreters when running Zeppelin in a Docker image?

Problem description

I built a Docker image on top of the Zeppelin Docker image and packaged my own configuration into it. It is already connected to LDAP for login and user impersonation, which I set up by overriding ZEPPELIN_IMPERSONATE_CMD in zeppelin-env.sh.

Running whoami with the sh interpreter now works correctly, and running id also shows all the correct user information from LDAP.

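However, when I switch the python interpreter to the per-user isolated setting and turn on user impersonation, it fails with an ImportError [1] shown in the note itself. The server log looks like [2]; I replaced the username with user_name.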

I tried not overriding ZEPPELIN_IMPERSONATE_CMD in zeppelin-env.sh. That only leads to a runtime exception [3] when running anything.

I also tried copying the /zeppelin/interpreter/python/py4j-0.9.2/src/py4j folder to /tmp, but that only ends with "python is not responding" after 10 seconds.

Does anyone have an idea how to run the python interpreter as the logged-in user?

[1]:

Traceback (most recent call last):
  File "/tmp/zeppelin_python-702917387527627656.py", line 20, in <module>
    from py4j.java_gateway import java_import, JavaGateway, GatewayClient
ImportError: No module named py4j.java_gateway
Traceback (most recent call last):
  File "/tmp/zeppelin_python-702917387527627656.py", line 20, in <module>
    from py4j.java_gateway import java_import, JavaGateway, GatewayClient
ImportError: No module named py4j.java_gateway
python is not responding

[2]:

 INFO [2019-08-14 15:13:14,716] ({pool-2-thread-2} ShellScriptLauncher.java[launch]:48) - Launching Interpreter: python
 INFO [2019-08-14 15:13:14,727] ({pool-2-thread-2} RemoteInterpreterManagedProcess.java[start]:115) - Thrift server for callback will start. Port: 40221
 INFO [2019-08-14 15:13:14,738] ({pool-2-thread-2} RemoteInterpreterManagedProcess.java[start]:190) - Run interpreter process [/zeppelin/bin/interpreter.sh, -d, /zeppelin/interpreter/python, -c, 172.17.0.3, -p, 40221, -r, :, -u, user_name, -l, /zeppelin/local-repo/python, -g, python]
 INFO [2019-08-14 15:13:16,400] ({pool-7-thread-1} RemoteInterpreterManagedProcess.java[callback]:123) - RemoteInterpreterServer Registered: CallbackInfo(host:172.17.0.3, port:40445)
 INFO [2019-08-14 15:13:16,440] ({pool-2-thread-2} RemoteInterpreter.java[call]:168) - Create RemoteInterpreter org.apache.zeppelin.python.PythonInterpreter
 INFO [2019-08-14 15:13:16,540] ({pool-2-thread-2} RemoteInterpreter.java[call]:168) - Create RemoteInterpreter org.apache.zeppelin.python.IPythonInterpreter
 INFO [2019-08-14 15:13:16,544] ({pool-2-thread-2} RemoteInterpreter.java[call]:168) - Create RemoteInterpreter org.apache.zeppelin.python.PythonInterpreterPandasSql
 INFO [2019-08-14 15:13:16,545] ({pool-2-thread-2} RemoteInterpreter.java[call]:168) - Create RemoteInterpreter org.apache.zeppelin.python.PythonCondaInterpreter
 INFO [2019-08-14 15:13:16,547] ({pool-2-thread-2} RemoteInterpreter.java[call]:168) - Create RemoteInterpreter org.apache.zeppelin.python.PythonDockerInterpreter
 INFO [2019-08-14 15:13:16,549] ({pool-2-thread-2} RemoteInterpreter.java[call]:142) - Open RemoteInterpreter org.apache.zeppelin.python.PythonInterpreter
 INFO [2019-08-14 15:13:16,549] ({pool-2-thread-2} RemoteInterpreter.java[pushAngularObjectRegistryToRemote]:436) - Push local angular object registry from ZeppelinServer to remote interpreter group python:user_name:
 WARN [2019-08-14 15:13:27,703] ({pool-2-thread-2} NotebookServer.java[afterStatusChange]:2316) - Job 20190814-151311_1784127416 is finished, status: ERROR, exception: null, result: %text Traceback (most recent call last):
  File "/tmp/zeppelin_python-4627212054430132450.py", line 20, in <module>
    from py4j.java_gateway import java_import, JavaGateway, GatewayClient
ImportError: No module named py4j.java_gateway

%text Traceback (most recent call last):
  File "/tmp/zeppelin_python-4627212054430132450.py", line 20, in <module>
    from py4j.java_gateway import java_import, JavaGateway, GatewayClient
ImportError: No module named py4j.java_gateway

%text python is not responding
 INFO [2019-08-14 15:13:27,713] ({pool-2-thread-2} VFSNotebookRepo.java[save]:196) - Saving note:2EJWQC1Y4
 INFO [2019-08-14 15:13:27,715] ({pool-2-thread-2} SchedulerFactory.java[jobFinished]:120) - Job 20190814-151311_1784127416 finished by scheduler org.apache.zeppelin.interpreter.remote.RemoteInterpreter-python:user_name:-shared_session

[3]:

java.lang.RuntimeException: ssh: connect to host localhost port 22: Cannot assign requested address


    at org.apache.zeppelin.interpreter.remote.RemoteInterpreterManagedProcess.start(RemoteInterpreterManagedProcess.java:205)
    at org.apache.zeppelin.interpreter.ManagedInterpreterGroup.getOrCreateInterpreterProcess(ManagedInterpreterGroup.java:64)
    at org.apache.zeppelin.interpreter.remote.RemoteInterpreter.getOrCreateInterpreterProcess(RemoteInterpreter.java:111)
    at org.apache.zeppelin.interpreter.remote.RemoteInterpreter.internal_create(RemoteInterpreter.java:164)
    at org.apache.zeppelin.interpreter.remote.RemoteInterpreter.open(RemoteInterpreter.java:132)
    at org.apache.zeppelin.interpreter.remote.RemoteInterpreter.getFormType(RemoteInterpreter.java:299)
    at org.apache.zeppelin.notebook.Paragraph.jobRun(Paragraph.java:407)
    at org.apache.zeppelin.scheduler.Job.run(Job.java:188)
    at org.apache.zeppelin.scheduler.RemoteScheduler$JobRunner.run(RemoteScheduler.java:315)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)

Tags: apache-zeppelin

Solution


I managed to solve this by adding:

export PYTHONPATH=/zeppelin/interpreter/python/py4j-0.9.2/src

to zeppelin-env.sh.

After that, the interpreter ran fine.
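
The fix is consistent with how Python resolves imports: a PYTHONPATH entry has to name the directory that contains the py4j package (here .../src), not the package directory itself. The following is a minimal sketch for checking a candidate path before putting it into zeppelin-env.sh. The helper name importable_with_pythonpath is hypothetical and not part of Zeppelin or py4j; it starts a fresh Python process with PYTHONPATH set to the given directory and reports whether the import succeeds. It assumes Python 3.5 or later for subprocess.run.

```python
import os
import subprocess
import sys


def importable_with_pythonpath(module, path_entry):
    """Return True if `module` imports in a fresh interpreter process
    started with PYTHONPATH set to `path_entry`.

    The interpreter's default search path (for example site-packages)
    is still consulted, so a module installed there also counts.
    """
    env = dict(os.environ)
    env["PYTHONPATH"] = path_entry
    result = subprocess.run(
        [sys.executable, "-c", "import " + module],
        env=env,
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    # A failed import makes the child process exit with a non-zero status.
    return result.returncode == 0
```

For the setup in this question, the call would be importable_with_pythonpath("py4j.java_gateway", "/zeppelin/interpreter/python/py4j-0.9.2/src"), with the path taken from the export line above. Running the check as the impersonated user is the closest match to what the interpreter process sees.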

