Can we flatten a column that contains JSON as its value in a Hive table?

Problem description

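I have a Hive column "events" that holds JSON values. How can I flatten this JSON to create a Hive table whose columns are the key fields of the JSON? Is it even possible? For example, I need the Hive table columns to be events, start_date, id, and details, with the corresponding values.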

| events |

|[{"start_date":20201230,"id":"3245ret","details":"Imp"},{"start_date":20201228,"id":"3245rtr","details":"NoImp"}] |

|[{"start_date":20191230,"id":"3245ret","details":"vImp"},{"start_date":20191228,"id":"3245rwer","details":"NoImp"}] |

Tags: sql, json, hadoop, hive, hiveql

Solution


Demo:

-- Split each events string into one JSON object per row, then extract the fields.
-- The inner subquery builds two inline sample rows that stand in for the real table.
select events,
       get_json_object(element, '$.id')         as id,
       get_json_object(element, '$.start_date') as start_date,
       get_json_object(element, '$.details')    as details
from
(
  select '[{"start_date":20201230,"id":"3245ret","details":"Imp"},{"start_date":20201228,"id":"3245rtr","details":"NoImp"}]' as events
  union all
  select '[{"start_date":20191230,"id":"3245ret","details":"vImp"},{"start_date":20191228,"id":"3245rwer","details":"NoImp"}]' as events
) s
lateral view outer explode(split(regexp_replace(events, '\\[|\\]', ''), '(?<=\\}),(?=\\{)')) e as element;

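The input string is first stripped of its square brackets and then split on the commas that sit between curly braces: the pattern (?<=\}),(?=\{) matches only a comma that is preceded by } and followed by {, so the commas inside each object are left alone. The resulting array is exploded with lateral view, and each JSON object is parsed with get_json_object.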
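To see what the split step produces on its own, the following minimal sketch runs only the regexp_replace and split calls from the demo against one of the sample strings, without exploding the array. It should return a single row whose elements column is an array of two strings, one per JSON object.

```sql
-- Inspect the intermediate array before it is exploded.
-- Step 1: regexp_replace drops every square bracket from the string.
-- Step 2: split cuts only on commas that sit between a closing and an opening brace.
select split(
         regexp_replace(events, '\\[|\\]', ''),
         '(?<=\\}),(?=\\{)'
       ) as elements
from
(
  select '[{"start_date":20201230,"id":"3245ret","details":"Imp"},{"start_date":20201228,"id":"3245rtr","details":"NoImp"}]' as events
) s;
```

Because regexp_replace removes every square bracket in the string, this approach assumes the JSON objects are flat and that no value contains a bracket or a nested array.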

Result:

 events                                                                                                             id       start_date details
[{"start_date":20201230,"id":"3245ret","details":"Imp"},{"start_date":20201228,"id":"3245rtr","details":"NoImp"}]   3245ret  20201230  Imp
[{"start_date":20201230,"id":"3245ret","details":"Imp"},{"start_date":20201228,"id":"3245rtr","details":"NoImp"}]   3245rtr  20201228  NoImp
[{"start_date":20191230,"id":"3245ret","details":"vImp"},{"start_date":20191228,"id":"3245rwer","details":"NoImp"}] 3245ret  20191230  vImp
[{"start_date":20191230,"id":"3245ret","details":"vImp"},{"start_date":20191228,"id":"3245rwer","details":"NoImp"}] 3245rwer 20191228  NoImp
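Since the question asks for a Hive table rather than a query result, the same query can be wrapped in a CREATE TABLE AS SELECT. The sketch below is one possible way to do that, not a definitive implementation: the table names source_table and events_flat are hypothetical placeholders, and the cast of start_date to int is an assumption based on the sample values (get_json_object itself returns strings).

```sql
-- Hypothetical names: source_table is the existing table with the events column,
-- events_flat is the new flattened table.
-- start_date is cast to int as an assumption. Drop the cast to keep it as a string.
create table events_flat as
select s.events,
       get_json_object(e.element, '$.id')                      as id,
       cast(get_json_object(e.element, '$.start_date') as int) as start_date,
       get_json_object(e.element, '$.details')                 as details
from source_table s
lateral view outer explode(
  split(regexp_replace(s.events, '\\[|\\]', ''), '(?<=\\}),(?=\\{)')
) e as element;
```

With lateral view outer, a row whose events value is NULL or empty still produces one output row with NULL id, start_date, and details. Use a plain lateral view instead if such rows should be dropped.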
