r - How to automate recoding using information from a code data frame in R
Question
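I recoded the variables v1 and v2 of the data frame df into df2. However, I have several more variables to recode, and it would be much easier if I could put the recoding information into another data frame (df3) and use some kind of loop to recode df with the information from df3. I have tried several solutions without success.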
library(dplyr)
#>
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#>
#> filter, lag
#> The following objects are masked from 'package:base':
#>
#> intersect, setdiff, setequal, union
df <- data.frame(
  v1 = c(1, 1, 2, 3),
  v2 = c(2, 3, 1, 1)
)
df <- df %>%
  mutate(v1 = factor(v1, levels = c(1, 2, 3), labels = c("red", "blue", "green"))) %>%
  mutate(v2 = factor(v2, levels = c(1, 2, 3), labels = c("white", "pale", "black")))
# recoding df using levels
df2 <- df %>%
  mutate(v1 = case_when(v1 %in% levels(v1)[1:2] ~ "Bad",
                        v1 %in% levels(v1)[3] ~ "Good")) %>%
  mutate(v2 = case_when(v2 %in% levels(v2)[1] ~ "Low",
                        v2 %in% levels(v2)[2:3] ~ "High"))
# df3 contains transformation codes for df
# I want to use this info to automate recoding of df
df3 <- data.frame(
  vs = c("v1", "v1", "v2", "v2"),      # variables
  ls = c("1:2", "3", "1", "2:3"),      # levels
  lb = c("Bad", "Good", "Low", "High") # new labels
)
Created on 2020-10-02 by the reprex package (v0.3.0)
Answer
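You can expand the sequences in the ls column of df3 and create a separate row for each number. Here eval(parse(text = ls)) evaluates a string such as "1:2" as R code, which yields the vector 1, 2.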
library(dplyr)
library(tidyr)
df4 <- df3 %>%
  rowwise() %>%
  mutate(new_ls = list(eval(parse(text = ls)))) %>%
  unnest(new_ls) %>%
  select(-ls)
df4
#   vs    lb    new_ls
#   <chr> <chr>  <dbl>
# 1 v1    Bad        1
# 2 v1    Bad        2
# 3 v1    Good       3
# 4 v2    Low        1
# 5 v2    High       2
# 6 v2    High       3
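The expansion step can also be done without evaluating strings as code, which may be preferable if the contents of ls are not fully trusted. The following is only a sketch under that assumption: expand_range() is a hypothetical helper introduced here for illustration, and df4_alt is an illustrative name; neither appears in the original answer.

```r
library(dplyr)
library(tidyr)

# Recoding table from the question
df3 <- data.frame(
  vs = c("v1", "v1", "v2", "v2"),
  ls = c("1:2", "3", "1", "2:3"),
  lb = c("Bad", "Good", "Low", "High"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: turn "1:2" into 1 2 and "3" into 3
# by splitting on ":" instead of calling eval(parse())
expand_range <- function(x) {
  bounds <- as.integer(strsplit(x, ":", fixed = TRUE)[[1]])
  seq(bounds[1], bounds[length(bounds)])
}

df4_alt <- df3 %>%
  mutate(new_ls = lapply(ls, expand_range)) %>%  # list column of integer vectors
  unnest(new_ls) %>%                             # one row per level code
  select(-ls)

df4_alt
```

The result has the same rows as df4, except that new_ls is stored as integer rather than double.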
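Convert df to long format, join it with df4, and reshape the result back to wide format. Note that the join matches on the numeric codes in new_ls, so df here is the numeric data frame as first created with data.frame(), before the factor() conversion.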
df %>%
  pivot_longer(cols = everything(), names_to = 'vs', values_to = 'new_ls') %>%
  left_join(df4, by = c('vs', 'new_ls')) %>%
  group_by(vs) %>%
  mutate(new_ls = row_number()) %>%
  pivot_wider(names_from = vs, values_from = lb) %>%
  select(-new_ls)
#   v1    v2
#   <chr> <chr>
# 1 Bad   High
# 2 Bad   High
# 3 Bad   Low
# 4 Good  Low
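Since the question asks for "some kind of loop", the same lookup can be sketched in base R without reshaping. This is an alternative to the pivot/join approach above, not the answer's own method. It assumes df holds the original numeric codes and that the expanded table df4 is available; both are written out by hand below so the example is self-contained.

```r
# Numeric data from the question (before the factor() conversion)
df <- data.frame(v1 = c(1, 1, 2, 3), v2 = c(2, 3, 1, 1))

# Expanded lookup table: one row per variable and level code
df4 <- data.frame(
  vs     = c("v1", "v1", "v1", "v2", "v2", "v2"),
  lb     = c("Bad", "Bad", "Good", "Low", "High", "High"),
  new_ls = c(1, 2, 3, 1, 2, 3),
  stringsAsFactors = FALSE
)

df2 <- df
for (v in unique(df4$vs)) {
  codes <- df4[df4$vs == v, ]                         # lookup rows for this variable
  df2[[v]] <- codes$lb[match(df[[v]], codes$new_ls)]  # code -> new label
}
df2
```

Codes that have no entry in df4 come out as NA, because match() returns NA for values it cannot find.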