How to pass a pyspark argument string containing spaces in spark-on-k8s-operator

Problem description

One of the pyspark args is a SQL query (a string containing spaces). I tried to pass it as - \"select * from table\"

But it is not treated as a single string, and `select *` gets expanded by bash, which mangles the SQL.

Example: the query above ends up as - \"select' folder1 file1.zip from 'table\"

Driver logs:

PYSPARK_ARGS=
+ '[' -n 'process  --query \"select * from table\"' ']'
+ PYSPARK_ARGS='process --query \"select * from table\"'
+ R_ARGS=
+ '[' -n '' ']'
+ '[' 3 == 2 ']'
+ '[' 3 == 3 ']'
++ python3 -V
+ pyv3='Python 3.7.3'
+ export PYTHON_VERSION=3.7.3
+ PYTHON_VERSION=3.7.3
+ export PYSPARK_PYTHON=python3
+ PYSPARK_PYTHON=python3
+ export PYSPARK_DRIVER_PYTHON=python3
+ PYSPARK_DRIVER_PYTHON=python3
+ case "$SPARK_K8S_CMD" in
+ CMD=("$SPARK_HOME/bin/spark-submit" --conf "spark.driver.bindAddress=$SPARK_DRIVER_BIND_ADDRESS" --deploy-mode client "$@" $PYSPARK_PRIMARY $PYSPARK_ARGS)
+ exec /usr/bin/tini -s -- /opt/spark/bin/spark-submit --conf spark.driver.bindAddress=xx.xx.xx.xx --deploy-mode client --class org.apache.spark.deploy.PythonRunner file:/usr/local/bin/process_sql.py process 
--query '\"select' folder1 file1.zip from 'table\"'
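The mangled query can be reproduced outside Kubernetes. In the `CMD` array above, `$PYSPARK_ARGS` is expanded unquoted, so bash word-splits the value on spaces and glob-expands the `*` against whatever happens to be in the working directory. A minimal sketch (the file names are hypothetical stand-ins for the container's working directory):

```shell
# Reproduce the word-splitting and globbing seen in the driver log.
tmp=$(mktemp -d) && cd "$tmp"
touch folder1 file1.zip            # stand-in files; any cwd contents would do

ARGS='--query "select * from table"'
printf '<%s>\n' $ARGS              # unquoted: split into words, * matches files
printf '<%s>\n' "$ARGS"            # quoted: survives as a single argument
```

The unquoted expansion prints `<file1.zip>` and `<folder1>` in place of the `*`, which is exactly how `folder1 file1.zip` appeared inside the query in the log.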

Is there a way to safely pass a string argument containing spaces, single quotes, or double quotes?

Tags: apache-spark, airflow

Solution
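Because the image's entrypoint expands `$PYSPARK_ARGS` unquoted, any argument containing spaces is word-split and glob-expanded before `spark-submit` ever sees it, so no amount of quoting on the submitting side survives. One workaround that avoids touching the image is to encode the query so the argument contains no shell-significant characters, then decode it inside the pyspark script. A minimal sketch (the helper names `encode_query`/`decode_query` are hypothetical, not part of any library):

```python
# Workaround sketch: base64-encode the query so the argument that travels
# through the operator and the image's entrypoint contains no spaces,
# quotes, or glob characters for bash to mangle.
import base64

def encode_query(query: str) -> str:
    """Run on the submitting side (e.g. in the Airflow DAG)."""
    return base64.urlsafe_b64encode(query.encode("utf-8")).decode("ascii")

def decode_query(arg: str) -> str:
    """Run inside the pyspark script before building the SQL."""
    return base64.urlsafe_b64decode(arg.encode("ascii")).decode("utf-8")

token = encode_query("select * from table")
assert " " not in token and '"' not in token  # safe to pass as one word
assert decode_query(token) == "select * from table"
```

Alternatively, if you control the Docker image, patching the entrypoint so the expansion is quoted (or the arguments are kept in a bash array) prevents the splitting at the source.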

