r - How do I apply pivot_wider to an entire data frame?
Problem description
I can use the following command to pivot_wider a specific column:
new_df <- pivot_wider(old_df, names_from = col10, values_from = value_col, values_fn = list)
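I want to pivot_wider using every column in the data frame (except an id column). What is the best way to do this? Should I use a loop, or is there a way to make the function take the whole data frame?
To clarify, using the example data frames below, I can go from old_df to new_df with the pivot_wider call shown above. I now want to go from old_df2 to new_df2.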
old_df <- structure(list(id = c("1", "1", "2"), col10 = c("yellow",
"green", "green"), value_col = c("1", "1", "1")), row.names = c(NA, -3L), class = c("tbl_df", "tbl", "data.frame"))
old_df2 <- structure(list(id = c("1", "1", "2"), col10 = c("yellow",
"green", "green"), col11 = c("dog",
"cat", "dog"), value_col = c("1", "1", "1")), row.names = c(NA, -3L), class = c("tbl_df", "tbl", "data.frame"))
new_df <- pivot_wider(old_df, names_from = col10, values_from = value_col, values_fn = list)
new_df2 <- structure(list(id = c("1", "2"), yellow = c("1", "NULL"), green = c("1", "1"), dog = c("1", "1"), cat = c("1", "NULL")), row.names = c(NA, -2L), class = c("tbl_df", "tbl", "data.frame"))
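Solution
If you want a separate column for each value found in those two columns (or in any number of columns), you first need to use pivot_longer to stack the values of all those columns into a single column, then use pivot_wider to spread them back out: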
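library(tidyr)
library(dplyr)  # needed for select(), group_by(), summarise() and across()

old_df2 %>%
  # Stack every column except id and value_col into one "vals" column
  pivot_longer(!c(id, value_col), names_to = "Cols", values_to = "vals") %>%
  # Create one column per distinct value
  pivot_wider(names_from = vals, values_from = value_col) %>%
  select(-Cols) %>%
  # Collapse to one row per id; missing cells are counted as 0
  group_by(id) %>%
  summarise(across(everything(), ~ sum(as.numeric(.x), na.rm = TRUE)))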
# A tibble: 2 x 5
  id    yellow   dog green   cat
  <chr>  <dbl> <dbl> <dbl> <dbl>
1 1          1     1     1     1
2 2          0     1     1     0
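As a possible variation on the pipeline above, the grouping step can be folded into pivot_wider itself. This is only a sketch under a few assumptions: it drops the Cols label before widening so that id is the only identifying column, it converts value_col to numeric up front, and it relies on the values_fn and values_fill arguments of pivot_wider accepting a single function and a single fill value (as they do in recent tidyr releases). The name result is just illustrative.

```r
library(tidyr)
library(dplyr)

# Example data from the question, rebuilt with tibble() for readability.
old_df2 <- tibble(
  id        = c("1", "1", "2"),
  col10     = c("yellow", "green", "green"),
  col11     = c("dog", "cat", "dog"),
  value_col = c("1", "1", "1")
)

result <- old_df2 %>%
  # value_col is stored as character, so convert it before summing.
  mutate(value_col = as.numeric(value_col)) %>%
  # Stack every column except id and value_col into one "vals" column.
  pivot_longer(!c(id, value_col), names_to = "Cols", values_to = "vals") %>%
  # Drop the source-column label so pivot_wider() keys on id alone.
  select(-Cols) %>%
  # One column per distinct value; sum duplicates, fill missing cells with 0.
  pivot_wider(
    names_from  = vals,
    values_from = value_col,
    values_fn   = sum,
    values_fill = 0
  )

print(result)
```

On the example data this should give the same one-row-per-id table as the group_by/summarise version. Which form is preferable is mostly a matter of taste: the original keeps each step explicit, while this one lets pivot_wider handle the aggregation.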