google-bigquery - how to expire partitions of existing partitioned tables in BigQuery
Problem description
I need to copy the production dataset in BigQuery to the testing environment and use it to simulate pipeline processing with new changes.
However, the production dataset is huge, so I only want to keep its most recent data for testing.
To do that, I would like to drop all partitioned data older than 30 days in my dataset.
I tried setting the partition expiration at the dataset level, but it didn't work.
So how can I do that?
Solution
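I ran some tests on this and can confirm the behavior.
A default partition expiration set at the dataset level applies only to tables created afterwards. For existing partitioned tables, you need to set the partition expiration on each table individually. For example: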
ALTER TABLE `gcp_A.dataset_1.measurements`
SET OPTIONS (
-- Sets partition expiration to 30 days
partition_expiration_days=30
);
-- Check the oldest timestamp still present in the table
SELECT MIN(stamp) FROM `gcp_A.dataset_1.measurements`;
-- [result]
-- 2021-06-15 00:00:00 UTC
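
If the dataset holds many partitioned tables, running the statement above by hand for each one gets tedious. Below is a minimal, untested sketch of how the same option might be applied to every partitioned table in the dataset with a BigQuery script. It assumes the dataset is `gcp_A.dataset_1` as in the example, that the same 30-day value is wanted everywhere, and that you have permission to query the dataset's INFORMATION_SCHEMA views and alter its tables. The loop variable `t` is just an illustrative name.

```sql
-- Sketch: set a 30-day partition expiration on every partitioned table
-- in one dataset. Adjust the project and dataset names to your own.
FOR t IN (
  SELECT DISTINCT table_catalog, table_schema, table_name
  FROM `gcp_A.dataset_1.INFORMATION_SCHEMA.PARTITIONS`
  WHERE partition_id IS NOT NULL  -- unpartitioned tables report NULL here
)
DO
  -- Build and run one ALTER TABLE statement per partitioned table
  EXECUTE IMMEDIATE FORMAT(
    "ALTER TABLE `%s.%s.%s` SET OPTIONS (partition_expiration_days = 30)",
    t.table_catalog, t.table_schema, t.table_name
  );
END FOR;

-- List the tables that now carry the option, with its value
SELECT table_name, option_value
FROM `gcp_A.dataset_1.INFORMATION_SCHEMA.TABLE_OPTIONS`
WHERE option_name = 'partition_expiration_days';
```

Since expired partitions are deleted, it is safer to run this only against the testing copy, never against the production dataset.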