python - How to assign a column value as the sum of another column's value and a constant in PySpark?
Problem description
I need to create a column called 'Sea freight days + Buffer'.
Its value should be final_df6['No of days take if sea freight'] + destuff_buffer
when Mode is equal to AIR. Here destuff_buffer is a constant.
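import numpy as np
from pyspark.sql.functions import col, when

destuff_buffer = 4  # constant number of buffer days
final_df6 = final_df6.withColumn(
    'Sea freight days + Buffer',
    when(col('Mode') == 'AIR', final_df6['No of days take if sea freight'] + destuff_buffer).otherwise(np.nan)
)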
But I am getting the following error:
Traceback (most recent call last):
  File "/opt/amazon/bin/runscript.py", line 67, in <module>
    runpy.run_path(script, run_name='__main__')
  File "/usr/lib64/python3.7/runpy.py", line 261, in run_path
    code, fname = _get_code_from_file(run_name, path_name)
  File "/usr/lib64/python3.7/runpy.py", line 236, in _get_code_from_file
    code = compile(f.read(), fname, 'exec')
  File "/tmp/ACOEtest", line 165
    destuff_buffer = 4
    ^
SyntaxError: invalid syntax

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/opt/amazon/bin/runscript.py", line 100, in <module>
    while"runpy.py" in new_stack.tb_frame.f_code.co_filename:
AttributeError: 'NoneType' object has no attribute 'tb_frame'
Solution
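from pyspark.sql import SparkSession
from pyspark.sql.functions import when
spark = SparkSession.builder.getOrCreate()
c = 40  # the constant to add
df = spark.createDataFrame([('AIR', 1), ('NONAIR', 5)], ['mode', 'd'])
# Add c only where mode is 'AIR'; otherwise(None) leaves the other rows null
df = df.withColumn('mycol', when(df.mode == 'AIR', df.d + c).otherwise(None))
df.show()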
+------+---+-----+
|  mode|  d|mycol|
+------+---+-----+
|   AIR|  1|   41|
|NONAIR|  5| null|
+------+---+-----+
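A note on the traceback in the question: a SyntaxError raised on a line that is valid by itself, such as destuff_buffer = 4, usually means an earlier statement was left unfinished, most often an unclosed parenthesis or bracket. The full script is not shown, so this is an assumption rather than a confirmed diagnosis. The sketch below reproduces the effect with the built-in compile(), which is the call that fails in the traceback. The script text, the file name 'input.csv', and the helper syntax_error_of are hypothetical and only serve the illustration.

```python
# Hypothetical script text: the first statement is missing its closing
# parenthesis, so the parser only fails once it reaches the next statement.
broken_source = (
    "rows = spark.read.csv('input.csv'\n"
    "destuff_buffer = 4\n"
)


def syntax_error_of(source):
    """Compile source without running it; return the SyntaxError or None."""
    try:
        compile(source, "<script>", "exec")
    except SyntaxError as exc:
        return exc
    return None


err = syntax_error_of(broken_source)
# The reported line number depends on the Python version: older parsers
# (such as 3.7) blame the following line, newer ones point at the open "(".
print(type(err).__name__, "reported at line", err.lineno)

# Closing the parenthesis makes the same text compile cleanly.
fixed_source = broken_source.replace("'input.csv'\n", "'input.csv')\n")
print(syntax_error_of(fixed_source))
```

If the error in the real script persists after fixing the withColumn call, checking the lines just above line 165 for an unbalanced bracket would be a reasonable next step.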