r - From quarterly to annual data
Problem description
I have a data frame that looks like this:
head(df_HPI)
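HPI is a quarterly index that I want to convert to annual. I have 17 regions (the CCAA column), so I want to aggregate HPI to get annual data for each region. I made some changes, but the code does not work.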
# Convert series to annual data
df_HPI <- df_HPI_original
# Replace period format
df_HPI <- data.frame(sapply(df_HPI, function(x) {gsub("T1","-01-01",x)})) # Q1
df_HPI <- data.frame(sapply(df_HPI, function(x) {gsub("T2","-04-01",x)})) # Q2
df_HPI <- data.frame(sapply(df_HPI, function(x) {gsub("T3","-07-01",x)})) # Q3
df_HPI <- data.frame(sapply(df_HPI, function(x) {gsub("T4","-10-01",x)})) # Q4
# Convert column into a date
df_HPI$Periodo <- as.Date(df_HPI$Periodo)
# Aggregate to annual data
df_HPI %>%
  mutate(Year = year(Periodo),
         Quarter = quarter(Periodo),
         Finyear = ifelse(Quarter <= 2, Year, Year + 1)) %>%
  group_by(Finyear, CCAA) %>%
  summarise(HPIy = mean(HPI))
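At the last step, R reports that the argument is not numeric or logical and returns NA.
Solution
The problem is that HPI is converted to a factor when you change the period format via gsub: sapply applies gsub to every column, including HPI, and gsub returns character data. (In R 4.0.0 and later, where data.frame no longer defaults to stringsAsFactors = TRUE, the column ends up as character instead of factor.) Hence, you have to convert it back to numeric. Going through as.character() first matters, because as.numeric() on a factor returns the integer level codes rather than the original values. Try this: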
library(dplyr)
library(lubridate)
set.seed(42)
# Example data
quarters <- paste0("T", c(1:4))
years <- c("2019", "2020")
dates <- c(paste0(years[[1]], quarters), paste0(years[[2]], quarters))
df_HPI <- data.frame(
Periodo = rep(dates, 2),
CCAA = c(rep("Region1", 8), rep("Region2", 8)),
HPI = runif(16)
)
head(df_HPI)
#> Periodo CCAA HPI
#> 1 2019T1 Region1 0.9148060
#> 2 2019T2 Region1 0.9370754
#> 3 2019T3 Region1 0.2861395
#> 4 2019T4 Region1 0.8304476
#> 5 2020T1 Region1 0.6417455
#> 6 2020T2 Region1 0.5190959
# Replace period format
df_HPI <- data.frame(sapply(df_HPI, function(x) {gsub("T1","-01-01",x)})) # Q1
df_HPI <- data.frame(sapply(df_HPI, function(x) {gsub("T2","-04-01",x)})) # Q2
df_HPI <- data.frame(sapply(df_HPI, function(x) {gsub("T3","-07-01",x)})) # Q3
df_HPI <- data.frame(sapply(df_HPI, function(x) {gsub("T4","-10-01",x)})) # Q4
# Convert column into a date
df_HPI$Periodo <- as.Date(df_HPI$Periodo)
# Problem: HPI was converted to a factor
class(df_HPI$HPI)
#> [1] "factor"
# Solution: Convert back to numeric
df_HPI$HPI <- as.numeric(as.character(df_HPI$HPI))
# Aggregate to annual data
df_HPI %>%
  mutate(Year = year(Periodo),
         Quarter = quarter(Periodo),
         Finyear = ifelse(Quarter <= 2, Year, Year + 1)) %>%
  group_by(Finyear, CCAA) %>%
  summarise(HPIy = mean(HPI))
#> # A tibble: 6 x 3
#> # Groups: Finyear [3]
#> Finyear CCAA HPIy
#> <dbl> <fct> <dbl>
#> 1 2019 Region1 0.926
#> 2 2019 Region2 0.681
#> 3 2020 Region1 0.569
#> 4 2020 Region2 0.592
#> 5 2021 Region1 0.436
#> 6 2021 Region2 0.701
Created on 2020-04-04 by the reprex package (v0.3.0)
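
The conversion back to numeric is only needed because gsub was applied to every column. A minimal sketch of an alternative, assuming the period strings always have the form YYYYTn: rewrite only the Periodo column, so HPI never leaves numeric. The lookup vector quarter_start and the result name annual are hypothetical names introduced for illustration. The sketch also groups by calendar year rather than by the Finyear variable used above, which is an assumption about the result that is wanted, and it uses base R only.

```r
# Hypothetical example data in the same shape as df_HPI above
set.seed(42)
df_HPI <- data.frame(
  Periodo = rep(paste0(rep(c("2019", "2020"), each = 4), "T", 1:4), 2),
  CCAA    = rep(c("Region1", "Region2"), each = 8),
  HPI     = runif(16),
  stringsAsFactors = FALSE
)

# Map each quarter suffix to the first day of that quarter (assumed format: YYYYTn)
quarter_start <- c(T1 = "-01-01", T2 = "-04-01", T3 = "-07-01", T4 = "-10-01")

# Rewrite only the Periodo column; HPI is never passed through gsub or sapply
year_part    <- substr(df_HPI$Periodo, 1, 4)
quarter_part <- substr(df_HPI$Periodo, 5, 6)
df_HPI$Periodo <- as.Date(paste0(year_part, quarter_start[quarter_part]))

# HPI is still numeric, so mean() works without any conversion
class(df_HPI$HPI)

# Calendar-year mean of HPI per region
df_HPI$Year <- as.integer(format(df_HPI$Periodo, "%Y"))
annual <- aggregate(HPI ~ Year + CCAA, data = df_HPI, FUN = mean)
annual
```

With two years and two regions this yields one row per year and region. To keep the Finyear grouping instead, the same idea applies: leave HPI out of the string replacement and the as.numeric(as.character(...)) step becomes unnecessary.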