Presto Kafka connector: parsing/reading JSON messages that contain a JSON array

Problem description

I am using Presto to read a Kafka topic. Each message has the following structure:

{
    "id": 1,
    "username": "John",
    "items":[
        {
            "name": "item1",
            "quantity": 3
        },
        {
            "name": "item2",
            "quantity": 1
        }
    ]
}

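To interpret the schema and turn it into columns, following https://prestodb.io/docs/current/connector/kafka-tutorial.html#step-6-map-all-the-values-from-the-topic-message-onto-columns, I defined a table definition file /etc/kafka/default.trialtable.json that maps all the values from the topic message onto columns, as follows: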

{    
    "tableName": "trialtable",
    "schemaName": "default",
    "topicName": "trialtable",
    "message": {
        "dataFormat": "json",
        "fields": [
            {
                "name": "id",
                "mapping": "id",
                "type": "BIGINT"
            },
            {
                "name": "username",
                "mapping": "username",
                "type": "VARCHAR"
            },
            {
                "name": "items",
                "mapping": "items",
                "type": "VARCHAR"
            }
        ]
    }    
}

This yields

id | username |                       items
----+----------+---------------------------------------------------
 1  | "John"   | [{"item":1,"quantity":3},{"item":5,"quantity":6}]
 2  | "Paul"   | [{"item":2,"quantity":2},{"item":3,"quantity":7}]

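That is, the fields id and username are extracted from the Kafka message and placed in their own columns. However, it is not clear to me how to extract the data in the items column so that the following is produced automatically: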

id | username | quantity | name
----+----------+----------+------
 1  | "John"   | 3        | item1
 1  | "John"   | 1        | item2

This can be achieved with the query

SELECT
   id,
   username,
   itm['name'] as name,
   itm['quantity'] as quantity
FROM
   (
      SELECT
         id,
         username,
         CAST(JSON_PARSE(items) as ARRAY<MAP<VARCHAR,VARCHAR>>) as t
      FROM kafka.default.trialtable
   )
CROSS JOIN UNNEST(t) as items(itm);
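
To make the intended transformation concrete, here is a minimal sketch, in plain Python and outside Presto, of the row expansion that the query above performs: one output row per element of the items array. The function name flatten_items is hypothetical and this is only an illustration of the desired result, not something the Kafka connector provides. Unlike the query, which casts every value to VARCHAR, the sketch keeps the JSON types as they are.

```python
import json


def flatten_items(message: str) -> list[tuple]:
    """Expand one Kafka message into one row per element of its "items" array."""
    record = json.loads(message)
    # Like CROSS JOIN UNNEST, an empty or missing "items" array yields no rows.
    return [
        (record["id"], record["username"], item["name"], item["quantity"])
        for item in record.get("items", [])
    ]


# The sample message from the question.
message = """
{
    "id": 1,
    "username": "John",
    "items": [
        {"name": "item1", "quantity": 3},
        {"name": "item2", "quantity": 1}
    ]
}
"""

for row in flatten_items(message):
    print(row)
```

For the sample message this produces the two rows (1, 'John', 'item1', 3) and (1, 'John', 'item2', 1), which is the shape of the flattened table shown earlier.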

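This is rather cumbersome. Is it possible to set up the configuration file /etc/kafka/default.trialtable.json so that this transformation is performed automatically? I tried setting type to "ARRAY<JSON>", which results in the error: unsupported column type 'array(json)' for column 'items'. Setting type to "ARRAY<MAP<VARCHAR, VARCHAR>>" gives the same result.

Can anyone help me? I would like to obtain something like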

id | username |                       items
----+----------+---------------------------------------------------
 1  | "John"   | [{"item":1,"quantity":3},{"item":5,"quantity":6}]
 2  | "Paul"   | [{"item":2,"quantity":2},{"item":3,"quantity":7}]

using only the configuration in catalog/kafka.

Tags: json, apache-kafka, presto

Solution

