apache-spark - Error: 'DataFrame' object has no attribute '_get_object_id'
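Problem description
When I run only the select, it returns data, but when I try to save the result to the data lake, I get this message:
Erro 'DataFrame' object has no attribute '_get_object_id'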
try:
    dfNovo = spark.read.format('parquet').load(dfNovo)
    histCZ = spark.read.format("parquet").load(histCZ)
    dfNovo = dfNovo.fillna('')
    histCZ = histCZ.fillna('')
    dfNovo.createOrReplaceTempView('hist_hz')
    histCZ.createOrReplaceTempView('hist_cz')
    spark.catalog.refreshTable("hist_hz")
    spark.catalog.refreshTable("hist_cz")
    c = spark.sql("""select distinct a.* from hist_hz a
        left join (select * from hist_cz) b
        on
        a.fornecimento = b.fornecimento
        and a.centro = b.centro
        and a.atribuicao = b.atribuicao
        and a.ped_pca = b.ped_pca
        and a.transporte = b.transporte
        and a.codigo_material = b.codigo_material
        and a.descr_produto = b.descr_produto
        and a.descr_status_pedido = b.descr_status_pedido
        and a.hora_puxada = b.hora_puxada
        and a.cliente = b.cliente
        and a.cliente_sap = b.cliente_sap
        and a.numero_nota_fiscal = b.numero_nota_fiscal
        and a.data_inicio_carregamento = b.data_inicio_carregamento
        and a.hora_inicio_carregamento = b.hora_inicio_carregamento
        and a.dt_termino_carregamento = b.dt_termino_carregamento
        and a.hora_termino_carregamento = b.hora_termino_carregamento
        and a.numeroov_pedtransf = b.numeroov_pedtransf
        and a.can_distrib = b.can_distrib
        and a.tipo_operacao = b.tipo_operacao
        and a.flagAtivo = b.flagAtivo
        where a.createdDate = '02-01-2020'
        and b.cliente_sap is null
        """)
    print(c.count())
    if (c.count() > 0):
        c.write.mode('overwrite').format('parquet').option("encoding", 'UTF-8').partitionBy('data_puxada').save(histCZ)
    print("Finalizado")
    #print(PickingAutomatico.count())
except Exception as e:
    print('Erro ', e)
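Solution
You are overwriting your own variable. histCZ starts out holding the path, and this line rebinds it to a DataFrame:
    histCZ = spark.read.format("parquet").load(histCZ)
You then use that same histCZ variable as the location to save the Parquet files, but by that point it is a DataFrame:
    c.write.mode('overwrite').format('parquet').option("encoding", 'UTF-8').partitionBy('data_puxada').save(histCZ)
At this point histCZ is not a location, so save() receives a DataFrame where it expects a path.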
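One way to avoid this kind of rebinding is to keep the path and the DataFrame under different names. The following is only a minimal sketch of that idea, not the asker's actual code: the helper names `load_parquet` and `save_parquet` and the `partition_col` parameter are hypothetical, and `spark` is assumed to be an existing SparkSession.

```python
def load_parquet(spark, path):
    """Read a Parquet dataset. `path` stays a string; the result is a DataFrame."""
    return spark.read.format("parquet").load(path)


def save_parquet(df, path, partition_col="data_puxada"):
    """Write `df` to `path`, failing early if `path` is not a string."""
    if not isinstance(path, str):
        # Catches the mistake from the question: a DataFrame passed as the location.
        raise TypeError(
            f"save_parquet() needs a path string, got {type(path).__name__}"
        )
    (
        df.write.mode("overwrite")
        .format("parquet")
        .option("encoding", "UTF-8")
        .partitionBy(partition_col)
        .save(path)
    )
```

With helpers like these, the question's code would call `hist_cz_df = load_parquet(spark, hist_cz_path)` and later `save_parquet(c, hist_cz_path)`, so the path variable is never replaced by the DataFrame read from it.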