How to add an extra field to a dataset in Vega-Lite

Question

My dataset is an array of the following form:

[
  { "DATE" : "2020-01-02", "COUNTRY" : "Spain", "COUNT" : 110 },
  { ... },
  { ... }
]

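There are multiple countries and multiple days. The dates have no gaps.

I want to inject a DAYS_PASSED field using the following algorithm (and then use it for the X axis):

  1. Look up the DAYS_PASSED value of the same country for the previous day and assign it to a variable TEMP. (If there is no previous day, assume 0.)
  2. Compute DAYS_PASSED using the following formula: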
   if TEMP > 0, then DAYS_PASSED = TEMP + 1
   else-if COUNT > 100 then DAYS_PASSED = 1
   else DAYS_PASSED = 0
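
For reference, the algorithm above can be sketched as a plain JavaScript preprocessing step. This is only an illustration of the rule as stated, not the asker's actual code: the function name addDaysPassed and the threshold argument are hypothetical, and it relies on the stated property that dates have no gaps, so the previous row of a country is its previous day.

```javascript
// Hypothetical helper: returns new rows with a DAYS_PASSED field added.
function addDaysPassed(rows, threshold = 100) {
  // Last DAYS_PASSED seen per country (the TEMP variable in the description).
  const previous = new Map();
  // ISO dates (YYYY-MM-DD) sort chronologically when compared as strings.
  const sorted = [...rows].sort((a, b) =>
    a.DATE < b.DATE ? -1 : a.DATE > b.DATE ? 1 : 0
  );
  return sorted.map((row) => {
    const temp = previous.get(row.COUNTRY) ?? 0; // 0 if no previous day
    let daysPassed;
    if (temp > 0) daysPassed = temp + 1;
    else if (row.COUNT > threshold) daysPassed = 1;
    else daysPassed = 0;
    previous.set(row.COUNTRY, daysPassed);
    return { ...row, DAYS_PASSED: daysPassed };
  });
}

const sample = [
  { DATE: "2020-01-01", COUNTRY: "Spain", COUNT: 90 },
  { DATE: "2020-01-02", COUNTRY: "Spain", COUNT: 110 },
  { DATE: "2020-01-03", COUNTRY: "Spain", COUNT: 105 },
];
const result = addDaysPassed(sample);
console.log(result.map((r) => r.DAYS_PASSED)); // [ 0, 1, 2 ]
```

Once the count has started for a country, it keeps incrementing even if COUNT later drops, because the first branch only looks at TEMP.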

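So far I have done this in a preprocessing step (outside Vega-Lite), but I would like to know whether the calculation can be moved into Vega-Lite, perhaps by plugging in a JavaScript function somehow.

I would also like to expose the 100 from the COUNT > 100 condition in the chart, so that the user can adjust it, for example to 200.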

Tags: vega-lite

Answer


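You can do this with a series of transforms: flag each day on which COUNT is still below 100, sum those flags per country to get an offset, number each country's days in date order, and subtract the offset. This assumes that, within a country, COUNT does not fall back below 100 once it has reached it. Note also that the condition used here is COUNT < 100, so a day with COUNT exactly 100 already counts as day 1; use <= instead to match the strict COUNT > 100 in the question. For example: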


  "transform": [
    {"calculate": "toDate(datum.DATE)", "as": "date"},
    {"calculate": "datum.COUNT < 100", "as": "pre100"},
    {
      "joinaggregate": [{"op": "sum", "field": "pre100", "as": "offset"}],
      "groupby": ["COUNTRY"]
    },
    {
      "window": [{"op": "count", "as": "daysPassed"}],
      "groupby": ["COUNTRY"],
      "sort": [{"field": "date"}]
    },
    {"calculate": "max(0, datum.daysPassed - datum.offset)", "as": "daysPassed"}
  ],
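
To see what each step of this transform chain contributes, here is a rough plain-JavaScript imitation of it. It does not call Vega-Lite; the function name daysPassedViaTransforms and the threshold argument are made up for illustration, and the running count assumes the window transform's default frame (all rows up to and including the current one).

```javascript
// Plain-JavaScript imitation of the transform chain; not the Vega-Lite API.
function daysPassedViaTransforms(rows, threshold = 100) {
  // calculate: flag days below the threshold (true counts as 1 when summed).
  const flagged = rows.map((r) => ({ ...r, pre100: r.COUNT < threshold }));

  // joinaggregate: per-country sum of pre100, shared by every row as "offset".
  const offsets = new Map();
  for (const r of flagged) {
    offsets.set(r.COUNTRY, (offsets.get(r.COUNTRY) ?? 0) + Number(r.pre100));
  }

  // window count: running row number within each country, in date order.
  const seen = new Map();
  const ordered = [...flagged].sort((a, b) =>
    a.DATE < b.DATE ? -1 : a.DATE > b.DATE ? 1 : 0
  );
  return ordered.map((r) => {
    const rank = (seen.get(r.COUNTRY) ?? 0) + 1;
    seen.set(r.COUNTRY, rank);
    // final calculate: shift the numbering; pre-threshold days clamp to 0.
    return { ...r, daysPassed: Math.max(0, rank - offsets.get(r.COUNTRY)) };
  });
}

const italy = [
  { DATE: "2020-02-02", COUNTRY: "Italy", COUNT: 90 },
  { DATE: "2020-02-03", COUNTRY: "Italy", COUNT: 100 },
  { DATE: "2020-02-04", COUNTRY: "Italy", COUNT: 140 },
];
console.log(daysPassedViaTransforms(italy).map((r) => r.daysPassed)); // [ 0, 1, 2 ]
```

Only one Italian day is below 100, so the offset is 1 and the day with COUNT exactly 100 becomes day 1.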

Here is a more complete example on a small dataset (it can be pasted into the Vega editor):

{
  "data": {
    "values": [
      {"DATE": "2020-02-02", "COUNTRY": "Spain", "COUNT": 50},
      {"DATE": "2020-02-03", "COUNTRY": "Spain", "COUNT": 70},
      {"DATE": "2020-02-04", "COUNTRY": "Spain", "COUNT": 110},
      {"DATE": "2020-02-05", "COUNTRY": "Spain", "COUNT": 150},
      {"DATE": "2020-02-06", "COUNTRY": "Spain", "COUNT": 200},
      {"DATE": "2020-02-02", "COUNTRY": "Italy", "COUNT": 90},
      {"DATE": "2020-02-03", "COUNTRY": "Italy", "COUNT": 100},
      {"DATE": "2020-02-04", "COUNTRY": "Italy", "COUNT": 140},
      {"DATE": "2020-02-05", "COUNTRY": "Italy", "COUNT": 190},
      {"DATE": "2020-02-06", "COUNTRY": "Italy", "COUNT": 250}
    ]
  },
  "transform": [
    {"calculate": "toDate(datum.DATE)", "as": "date"},
    {"calculate": "datum.COUNT < 100", "as": "pre100"},
    {
      "joinaggregate": [{"op": "sum", "field": "pre100", "as": "offset"}],
      "groupby": ["COUNTRY"]
    },
    {
      "window": [{"op": "count", "as": "daysPassed"}],
      "groupby": ["COUNTRY"],
      "sort": [{"field": "date"}]
    },
    {"calculate": "max(0, datum.daysPassed - datum.offset)", "as": "daysPassed"}
  ],
  "concat": [
    {
      "mark": "line",
      "encoding": {
        "x": {"field": "DATE", "type": "temporal"},
        "y": {"field": "COUNT", "type": "quantitative"},
        "color": {"field": "COUNTRY", "type": "nominal"}
      }
    },
    {
      "mark": "line",
      "transform": [{"filter": "datum.daysPassed > 0"}],
      "encoding": {
        "x": {"field": "daysPassed", "type": "quantitative"},
        "y": {"field": "COUNT", "type": "quantitative"},
        "color": {"field": "COUNTRY", "type": "nominal"}
      }
    }
  ]
}
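
The question also asks for a user-adjustable threshold, which the spec above hard-codes as 100. One possible approach, assuming Vega-Lite 5 or later (where top-level params can be bound to input widgets and referenced in expressions), is sketched below as a JavaScript object. The parameter name threshold, the slider range, and the helper withThreshold are illustrative choices, not part of the original answer.

```javascript
// Assumes Vega-Lite 5+; "threshold" is a hypothetical parameter name.
const thresholdParam = {
  name: "threshold",
  value: 100,
  // Renders a slider under the chart; the range and step are arbitrary.
  bind: { input: "range", min: 0, max: 300, step: 10 },
};

// Same transform chain as in the spec, but comparing against the parameter
// instead of the literal 100.
const transform = [
  { calculate: "toDate(datum.DATE)", as: "date" },
  { calculate: "datum.COUNT < threshold", as: "pre100" },
  {
    joinaggregate: [{ op: "sum", field: "pre100", as: "offset" }],
    groupby: ["COUNTRY"],
  },
  {
    window: [{ op: "count", as: "daysPassed" }],
    groupby: ["COUNTRY"],
    sort: [{ field: "date" }],
  },
  { calculate: "max(0, datum.daysPassed - datum.offset)", as: "daysPassed" },
];

// Hypothetical helper: returns a copy of a spec with the parameter added
// and the top-level transform replaced by the parameterized version.
function withThreshold(spec) {
  return {
    ...spec,
    params: [...(spec.params ?? []), thresholdParam],
    transform,
  };
}

console.log(JSON.stringify(withThreshold({ data: { values: [] } }), null, 2));
```

The resulting object could then be passed to whatever embedding step is already in use; that step is not shown here.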
