vega-lite - How to add an extra field to a dataset in Vega-Lite
Question
My dataset is an array of the following form:
[
{ "DATE" : "2020-01-02", "COUNTRY" : "Spain", "COUNT" : 110 },
{ ... },
{ ... }
]
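There are multiple countries and multiple days. The dates have no gaps.
I would like to inject a DAYS_PASSED field using the following algorithm (and then use it for the X axis):
- Look up the DAYS_PASSED value for the previous day of the same country and assign it to a variable TEMP (if the previous day does not exist, assume 0).
- Compute DAYS_PASSED using the following formula: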
if TEMP > 0, then DAYS_PASSED = TEMP + 1
else-if COUNT > 100 then DAYS_PASSED = 1
else DAYS_PASSED = 0
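As a hand-worked illustration of this rule, consider a single country on five consecutive days. The COUNT values below are made up for the example, and the DAYS_PASSED values are the intended output of the formula, not something Vega-Lite has produced:

```json
[
  {"DATE": "2020-02-02", "COUNTRY": "Spain", "COUNT": 50,  "DAYS_PASSED": 0},
  {"DATE": "2020-02-03", "COUNTRY": "Spain", "COUNT": 70,  "DAYS_PASSED": 0},
  {"DATE": "2020-02-04", "COUNTRY": "Spain", "COUNT": 110, "DAYS_PASSED": 1},
  {"DATE": "2020-02-05", "COUNTRY": "Spain", "COUNT": 150, "DAYS_PASSED": 2},
  {"DATE": "2020-02-06", "COUNTRY": "Spain", "COUNT": 200, "DAYS_PASSED": 3}
]
```

Once the counter has started it keeps increasing, even if COUNT later falls back to 100 or below, because the TEMP > 0 branch is checked first.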
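So far I have been doing this in a preprocessing step (outside Vega-Lite), but I am wondering whether the calculation can be moved into Vega-Lite, perhaps by plugging in a JavaScript function somehow?
I would also like to expose the 100 (from the COUNT > 100 condition) in the chart, so that the user can adjust it to, say, 200.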
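Answer
You can do this with a sequence of transforms: flag the days on which COUNT has not yet crossed the threshold, sum those flags per country to get an offset, number each country's days in date order with a window transform, and subtract the offset from that running number. This reproduces the algorithm as long as COUNT does not fall back to the threshold or below once it has passed it. For example:
"transform": [
  {"calculate": "toDate(datum.DATE)", "as": "date"},
  {"calculate": "datum.COUNT <= 100", "as": "pre100"},
  {
    "joinaggregate": [{"op": "sum", "field": "pre100", "as": "offset"}],
    "groupby": ["COUNTRY"]
  },
  {
    "window": [{"op": "count", "as": "daysPassed"}],
    "groupby": ["COUNTRY"],
    "sort": [{"field": "date"}]
  },
  {"calculate": "max(0, datum.daysPassed - datum.offset)", "as": "daysPassed"}
],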
Here is a more complete example with a small dataset, which can be pasted into the Vega editor:
{
  "data": {
    "values": [
      {"DATE": "2020-02-02", "COUNTRY": "Spain", "COUNT": 50},
      {"DATE": "2020-02-03", "COUNTRY": "Spain", "COUNT": 70},
      {"DATE": "2020-02-04", "COUNTRY": "Spain", "COUNT": 110},
      {"DATE": "2020-02-05", "COUNTRY": "Spain", "COUNT": 150},
      {"DATE": "2020-02-06", "COUNTRY": "Spain", "COUNT": 200},
      {"DATE": "2020-02-02", "COUNTRY": "Italy", "COUNT": 90},
      {"DATE": "2020-02-03", "COUNTRY": "Italy", "COUNT": 100},
      {"DATE": "2020-02-04", "COUNTRY": "Italy", "COUNT": 140},
      {"DATE": "2020-02-05", "COUNTRY": "Italy", "COUNT": 190},
      {"DATE": "2020-02-06", "COUNTRY": "Italy", "COUNT": 250}
    ]
  },
  "transform": [
    {"calculate": "toDate(datum.DATE)", "as": "date"},
    {"calculate": "datum.COUNT <= 100", "as": "pre100"},
    {
      "joinaggregate": [{"op": "sum", "field": "pre100", "as": "offset"}],
      "groupby": ["COUNTRY"]
    },
    {
      "window": [{"op": "count", "as": "daysPassed"}],
      "groupby": ["COUNTRY"],
      "sort": [{"field": "date"}]
    },
    {"calculate": "max(0, datum.daysPassed - datum.offset)", "as": "daysPassed"}
  ],
  "concat": [
    {
      "mark": "line",
      "encoding": {
        "x": {"field": "DATE", "type": "temporal"},
        "y": {"field": "COUNT", "type": "quantitative"},
        "color": {"field": "COUNTRY", "type": "nominal"}
      }
    },
    {
      "mark": "line",
      "transform": [{"filter": "datum.daysPassed > 0"}],
      "encoding": {
        "x": {"field": "daysPassed", "type": "quantitative"},
        "y": {"field": "COUNT", "type": "quantitative"},
        "color": {"field": "COUNTRY", "type": "nominal"}
      }
    }
  ]
}
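The question also asks for a way to let the user change the threshold of 100. One possible sketch, assuming Vega-Lite 5 or later (where top-level variable parameters can be bound to input widgets and referenced in expressions), is to replace the hard-coded 100 with a parameter. The parameter name countThreshold, the field names preThreshold and dayIndex, and the slider range are illustrative choices, not part of the original answer:

```json
{
  "$schema": "https://vega.github.io/schema/vega-lite/v5.json",
  "params": [
    {
      "name": "countThreshold",
      "value": 100,
      "bind": {"input": "range", "min": 0, "max": 300, "step": 10}
    }
  ],
  "data": {
    "values": [
      {"DATE": "2020-02-02", "COUNTRY": "Spain", "COUNT": 50},
      {"DATE": "2020-02-03", "COUNTRY": "Spain", "COUNT": 70},
      {"DATE": "2020-02-04", "COUNTRY": "Spain", "COUNT": 110},
      {"DATE": "2020-02-05", "COUNTRY": "Spain", "COUNT": 150},
      {"DATE": "2020-02-06", "COUNTRY": "Spain", "COUNT": 200},
      {"DATE": "2020-02-02", "COUNTRY": "Italy", "COUNT": 90},
      {"DATE": "2020-02-03", "COUNTRY": "Italy", "COUNT": 100},
      {"DATE": "2020-02-04", "COUNTRY": "Italy", "COUNT": 140},
      {"DATE": "2020-02-05", "COUNTRY": "Italy", "COUNT": 190},
      {"DATE": "2020-02-06", "COUNTRY": "Italy", "COUNT": 250}
    ]
  },
  "transform": [
    {"calculate": "toDate(datum.DATE)", "as": "date"},
    {"calculate": "datum.COUNT <= countThreshold ? 1 : 0", "as": "preThreshold"},
    {
      "joinaggregate": [{"op": "sum", "field": "preThreshold", "as": "offset"}],
      "groupby": ["COUNTRY"]
    },
    {
      "window": [{"op": "count", "as": "dayIndex"}],
      "groupby": ["COUNTRY"],
      "sort": [{"field": "date"}]
    },
    {"calculate": "max(0, datum.dayIndex - datum.offset)", "as": "daysPassed"},
    {"filter": "datum.daysPassed > 0"}
  ],
  "mark": "line",
  "encoding": {
    "x": {"field": "daysPassed", "type": "quantitative"},
    "y": {"field": "COUNT", "type": "quantitative"},
    "color": {"field": "COUNTRY", "type": "nominal"}
  }
}
```

Moving the slider should re-run the transforms with the new threshold. As with the fixed version, this relies on COUNT not dropping back to the threshold or below after it has been exceeded.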