r - List of data frames, but each list item holds a df and another value, so bind_rows() cannot combine them
Question
I have a list of data frames:
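library(tidyverse)  # diamonds (ggplot2), dplyr verbs, purrr::map

mydiamonds <- diamonds %>%
  group_by(cut, color) %>%
  mutate(cumprice = cumsum(price)) %>%
  mutate(lag_cumprice = lag(cumprice)) %>%
  na.omit(.) %>%                     # drops the first row of each group (lag is NA)
  group_split %>%
  map(~ list(dta = ., initial_val = min(.$cumprice)))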
If this were a list of data frames and nothing else, I think I could combine them into a single data frame with just:
mydiamonds %>% bind_rows %>% glimpse
However, this produces an error:
Error: Internal error in `vec_assign()`: `value` should have been recycled to fit `x`.
Presumably that is because it is not a plain list of data frames: each list item contains a df and a numeric value:
mydiamonds[[1]]
$dta
# A tibble: 162 x 12
carat cut color clarity depth table price x y z cumprice lag_cumprice
<dbl> <ord> <ord> <ord> <dbl> <dbl> <int> <dbl> <dbl> <dbl> <int> <int>
1 0.71 Fair D VS2 56.9 65 2858 5.89 5.84 3.34 5706 2848
2 0.9 Fair D SI2 66.9 57 2885 6.02 5.9 3.99 8591 5706
3 1 Fair D SI2 69.3 58 2974 5.96 5.87 4.1 11565 8591
4 1.01 Fair D SI2 64.6 56 3003 6.31 6.24 4.05 14568 11565
5 0.73 Fair D VS1 66 54 3047 5.56 5.66 3.7 17615 14568
6 0.71 Fair D VS2 64.7 58 3077 5.61 5.58 3.62 20692 17615
7 0.91 Fair D SI2 62.5 66 3079 6.08 6.01 3.78 23771 20692
8 0.9 Fair D SI2 65.9 59 3205 6 5.95 3.94 26976 23771
9 0.9 Fair D SI2 66 58 3205 6 5.97 3.95 30181 26976
10 0.9 Fair D SI2 64.7 54 3205 6.1 6.04 3.93 33386 30181
# … with 152 more rows
$initial_val
[1] 5706
Is there a way to tell bind_rows() to use only the $dta part of each list item?
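Answer
We can pluck only the tibble from each item by looping over the elements with map; the _dfr suffix row-binds all of the extracted elements into one data frame: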
library(purrr)
mydiamonds_full <- map_dfr(mydiamonds, pluck, 'dta')
Checking the result:
glimpse(mydiamonds_full)
Rows: 53,905
Columns: 12
$ carat <dbl> 0.71, 0.90, 1.00, 1.01, 0.73, 0.71, 0.91, 0.90, 0.90, 0.90, 0.90, 0.90, 0.25, 0.70, 1.00, 0.90, 0.95, 0.90, 0.90, 1.00, 0.90, 0.90, 0.91, 0.91, 1.03, 0.90…
$ cut <ord> Fair, Fair, Fair, Fair, Fair, Fair, Fair, Fair, Fair, Fair, Fair, Fair, Fair, Fair, Fair, Fair, Fair, Fair, Fair, Fair, Fair, Fair, Fair, Fair, Fair, Fair…
$ color <ord> D, D, D, D, D, D, D, D, D, D, D, D, D, D, D, D, D, D, D, D, D, D, D, D, D, D, D, D, D, D, D, D, D, D, D, D, D, D, D, D, D, D, D, D, D, D, D, D, D, D, D, D…
$ clarity <ord> VS2, SI2, SI2, SI2, VS1, VS2, SI2, SI2, SI2, SI2, SI2, SI2, VS1, VVS2, SI2, SI1, SI2, SI2, SI2, SI2, SI1, SI1, SI1, SI1, SI2, SI1, SI2, SI2, SI1, SI2, SI2…
$ depth <dbl> 56.9, 66.9, 69.3, 64.6, 66.0, 64.7, 62.5, 65.9, 66.0, 64.7, 65.7, 64.7, 61.2, 58.5, 64.8, 66.4, 64.4, 64.9, 64.5, 65.2, 64.8, 64.5, 64.7, 65.2, 66.4, 65.7…
$ table <dbl> 65, 57, 58, 56, 54, 58, 66, 59, 58, 54, 60, 59, 55, 62, 60, 59, 60, 57, 61, 56, 59, 61, 61, 57, 56, 65, 66, 58, 61, 59, 59, 60, 57, 66, 66, 55, 56, 59, 53…
$ price <int> 2858, 2885, 2974, 3003, 3047, 3077, 3079, 3205, 3205, 3205, 3205, 3205, 563, 3296, 3304, 3382, 3384, 3473, 3473, 3634, 3689, 3689, 3730, 3730, 3743, 3751,…
$ x <dbl> 5.89, 6.02, 5.96, 6.31, 5.56, 5.61, 6.08, 6.00, 6.00, 6.10, 5.98, 6.09, 4.09, 5.72, 6.23, 5.97, 6.06, 6.03, 6.10, 6.27, 6.10, 6.05, 6.06, 6.08, 6.31, 6.06…
$ y <dbl> 5.84, 5.90, 5.87, 6.24, 5.66, 5.58, 6.01, 5.95, 5.97, 6.04, 5.93, 5.99, 4.11, 5.81, 6.18, 5.92, 6.02, 5.98, 6.00, 6.21, 6.03, 6.01, 5.99, 6.04, 6.19, 5.94…
$ z <dbl> 3.34, 3.99, 4.10, 4.05, 3.70, 3.62, 3.78, 3.94, 3.95, 3.93, 3.91, 3.91, 2.51, 3.37, 4.02, 3.95, 3.89, 3.90, 3.90, 4.07, 3.93, 3.89, 3.90, 3.95, 4.15, 3.94…
$ cumprice <int> 5706, 8591, 11565, 14568, 17615, 20692, 23771, 26976, 30181, 33386, 36591, 39796, 40359, 43655, 46959, 50341, 53725, 57198, 60671, 64305, 67994, 71683, 75…
$ lag_cumprice <int> 2848, 5706, 8591, 11565, 14568, 17615, 20692, 23771, 26976, 30181, 33386, 36591, 39796, 40359, 43655, 46959, 50341, 53725, 57198, 60671, 64305, 67994, 716…
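The same idea can be sketched on a small list that is quick to inspect by eye. The toy data and the names toy and toy_full are hypothetical, chosen only to mirror the structure in the question. Note that in purrr 1.0.0 and later map_dfr() is marked as superseded, so this sketch uses map() with an element name followed by bind_rows() instead (list_rbind() is the purrr equivalent).

```r
library(dplyr)
library(purrr)

# Hypothetical toy list mirroring the question's structure:
# each item holds a data frame (dta) plus a scalar (initial_val).
toy <- list(
  list(dta = tibble(g = "a", x = 1:2), initial_val = 1L),
  list(dta = tibble(g = "b", x = 3:5), initial_val = 3L)
)

# map() with a string extracts the element of that name from each item,
# leaving a plain list of tibbles that bind_rows() can stack.
toy_full <- toy %>%
  map("dta") %>%
  bind_rows()

toy_full
```

Passing the name "dta" directly to map() is shorthand for map(toy, pluck, "dta"), so the extraction step is the same as in the answer above.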
Alternatively, we could use keep to retain only the tibble elements of each item, then flatten and row-bind them:
map_dfr(mydiamonds, ~ keep(.x, is_tibble) %>%
  flatten_dfr)
# A tibble: 53,905 x 12
carat cut color clarity depth table price x y z cumprice lag_cumprice
<dbl> <ord> <ord> <ord> <dbl> <dbl> <int> <dbl> <dbl> <dbl> <int> <int>
1 0.71 Fair D VS2 56.9 65 2858 5.89 5.84 3.34 5706 2848
2 0.9 Fair D SI2 66.9 57 2885 6.02 5.9 3.99 8591 5706
3 1 Fair D SI2 69.3 58 2974 5.96 5.87 4.1 11565 8591
4 1.01 Fair D SI2 64.6 56 3003 6.31 6.24 4.05 14568 11565
5 0.73 Fair D VS1 66 54 3047 5.56 5.66 3.7 17615 14568
6 0.71 Fair D VS2 64.7 58 3077 5.61 5.58 3.62 20692 17615
7 0.91 Fair D SI2 62.5 66 3079 6.08 6.01 3.78 23771 20692
8 0.9 Fair D SI2 65.9 59 3205 6 5.95 3.94 26976 23771
9 0.9 Fair D SI2 66 58 3205 6 5.97 3.95 30181 26976
10 0.9 Fair D SI2 64.7 54 3205 6.1 6.04 3.93 33386 30181
# … with 53,895 more rows
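Filtering by type rather than by name is handy when the data frame element is not always called dta. A minimal sketch on hypothetical toy data follows (toy and toy_full are illustration-only names). It swaps flatten_dfr for base unlist(recursive = FALSE) plus bind_rows(), on the assumption that a recent purrr is installed where the flatten family is deprecated.

```r
library(dplyr)
library(purrr)

# Hypothetical toy list: the data frame sits under a different name in each item.
toy <- list(
  list(dta = tibble(g = "a", x = 1:2), initial_val = 1L),
  list(tbl = tibble(g = "b", x = 3:5), initial_val = 3L)
)

toy_full <- toy %>%
  map(~ keep(.x, is.data.frame)) %>%  # drop everything that is not a data frame
  unlist(recursive = FALSE) %>%       # remove one level of nesting: a list of tibbles
  bind_rows()                         # stack them row-wise

toy_full
```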
Or use base R (the \(x) lambda shorthand requires R 4.1 or later):
do.call(rbind, lapply(mydiamonds, \(x) x$dta))
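For R versions before 4.1, the same extraction can be written without the lambda shorthand. The sketch below uses hypothetical toy data (toy and toy_full are illustration-only names) and passes the `[[` operator to lapply() to pull out each dta element.

```r
# Hypothetical toy list mirroring the question's structure, base R only.
toy <- list(
  list(dta = data.frame(g = "a", x = 1:2), initial_val = 1L),
  list(dta = data.frame(g = "b", x = 3:5), initial_val = 3L)
)

# `[[` with "dta" extracts that element from each item;
# function(x) x$dta would be an equivalent spelling.
toy_full <- do.call(rbind, lapply(toy, `[[`, "dta"))

toy_full
```

Keep in mind that rbind() matches columns by name and expects every data frame to have the same set of columns, whereas bind_rows() fills missing columns with NA.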