For loop to mutate multiple columns

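Problem description

I have a tibble, songs, that is too large to share here. That doesn't matter, though; the question applies to any tibble that has only dbl values.

The idea is that I have previously selected one row. It can be any of them, without any prior knowledge. The first thing I do is filter it out: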

songs2 <- songs %>%
  anti_join(choice)

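This works.

By the way, choice has a single row.

Now I have created a second tibble (actually the third, but the second in this post) called dist, which has only dbl values (so it shares its columns with choice). I want to subtract the values in choice from every row of dist.

I tried writing this: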

for (i in seq_along(distUseful)) {
  dist <- dist %>%
    mutate_(distUseful[i] = (.data[[i]] - choice[[i]]))
}

But it doesn't work:

> for (i in seq_along(distUseful)) {
+   dist <- dist %>%
+     mutate_(distUseful[i] = (.data[[i]] - choice[[i]]))
Error: unexpected '=' in:
"  dist <- dist %>%
    mutate_(distUseful[i] ="
> }
Error: unexpected '}' in "}"

Edit: here are the first 10 rows of songs2, as requested in the comments:

structure(list(acousticness = c(0.991, 0.643, 0.993, 0.000173, 
0.295, 0.996, 0.992, 0.996, 0.996, 0.00682), artists = c("['Mamie Smith']", 
"[\"Screamin' Jay Hawkins\"]", "['Mamie Smith']", "['Oscar Velazquez']", 
"['Mixe']", "['Mamie Smith & Her Jazz Hounds']", "['Mamie Smith']", 
"['Mamie Smith & Her Jazz Hounds']", "['Francisco Canaro']", 
"['Meetya']"), danceability = c(0.598, 0.852, 0.647, 0.73, 0.704, 
0.424, 0.782, 0.474, 0.469, 0.571), duration_ms = c(168333, 150200, 
163827, 422087, 165224, 198627, 195200, 186173, 146840, 476304
), energy = c(0.224, 0.517, 0.186, 0.798, 0.707, 0.245, 0.0573, 
0.239, 0.238, 0.753), explicit = c(FALSE, FALSE, FALSE, FALSE, 
TRUE, FALSE, FALSE, FALSE, FALSE, FALSE), id = c("0cS0A1fUEUd1EW3FcF8AEI", 
"0hbkKFIJm7Z05H8Zl9w30f", "11m7laMUgmOKqI3oYzuhne", "19Lc5SfJJ5O1oaxY0fpwfh", 
"2hJjbsLCytGsnAHfdsLejp", "3HnrHGLE9u2MjHtdobfWl9", "5DlCyqLyX2AOVDTjjkDZ8x", 
"02FzJbHtqElixxCmrpSCUa", "02i59gYdjlhBmbbWhf8YuK", "06NUxS2XL3efRh0bloxkHm"
), instrumentalness = c(0.000522, 0.0264, 1.76e-05, 0.801, 0.000246, 
0.799, 1.61e-06, 0.186, 0.96, 0.873), key = c(5, 5, 0, 2, 10, 
5, 5, 9, 8, 8), liveness = c(0.379, 0.0809, 0.519, 0.128, 0.402, 
0.235, 0.176, 0.195, 0.149, 0.092), loudness = c(-12.628, -7.261, 
-12.098, -7.311, -6.036, -11.47, -12.453, -9.712, -18.717, -6.943
), mode = c(0, 0, 1, 1, 0, 1, 1, 1, 1, 1), name = c("Keep A Song In Your Soul", 
"I Put A Spell On You", "Golfing Papa", "True House Music - Xavier Santos & Carlos Gomix Remix", 
"Xuniverxe", "Crazy Blues - 78rpm Version", "Don't You Advertise Your Man", 
"Arkansas Blues", "La Chacarera - Remasterizado", "Broken Puppet - Original Mix"
), popularity = c(12, 7, 4, 17, 2, 9, 5, 0, 0, 0), release_date = c("1920", 
"1920-01-05", "1920", "1920-01-01", "1920-10-01", "1920", "1920", 
"1920", "1920-07-08", "1920-01-01"), speechiness = c(0.0936, 
0.0534, 0.174, 0.0425, 0.0768, 0.0397, 0.0592, 0.0289, 0.0741, 
0.0446), tempo = c(149.976, 86.889, 97.6, 127.997, 122.076, 103.87, 
85.652, 78.784, 130.06, 126.993), valence = c(0.634, 0.95, 0.689, 
0.0422, 0.299, 0.477, 0.487, 0.366, 0.621, 0.119), year = c(1920, 
1920, 1920, 1920, 1920, 1920, 1920, 1920, 1920, 1920)), row.names = c(NA, 
-10L), class = c("tbl_df", "tbl", "data.frame"))

Here is choice:

structure(list(acousticness = 0.511, danceability = 0.403, duration_ms = 117395, 
    instrumentalness = 0.896, liveness = 0.108, loudness = -8.126, 
    popularity = 65, speechiness = 0.0514, tempo = 135.047, valence = 0.192), row.names = c(NA, 
-1L), class = c("tbl_df", "tbl", "data.frame"))

Finally:

distUseful <- c("acousticness", "danceability", "duration_ms", "duration_ms", "instrumentalness", "liveness", "loudness", "popularity", "speechiness", "tempo", "valence")

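Edit 2: Just an afterthought: if you take the loop I quoted earlier and run a single iteration by hand (picking the variable yourself), it works. So the problem really is the first part, distUseful[i] =, as both the error message and the code suggest.

Edit 3: For example, if I do this for the first column only, this is what happens (the first column is correct and the rest are unchanged):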

> dist %>%
+     mutate(acousticness = (dist[[1]] - choice[[1]]))
# A tibble: 174,388 x 10
   acousticness danceability duration_ms instrumentalness liveness loudness popularity speechiness tempo valence
          <dbl>        <dbl>       <dbl>            <dbl>    <dbl>    <dbl>      <dbl>       <dbl> <dbl>   <dbl>
 1        0.48         0.598      168333       0.000522     0.379    -12.6          12      0.0936 150.   0.634 
 2        0.132        0.852      150200       0.0264       0.0809    -7.26          7      0.0534  86.9  0.95  
 3        0.482        0.647      163827       0.0000176    0.519    -12.1           4      0.174   97.6  0.689 
 4       -0.511        0.73       422087       0.801        0.128     -7.31         17      0.0425 128.   0.0422
 5       -0.216        0.704      165224       0.000246     0.402     -6.04          2      0.0768 122.   0.299 
 6        0.485        0.424      198627       0.799        0.235    -11.5           9      0.0397 104.   0.477 
 7        0.481        0.782      195200       0.00000161   0.176    -12.5           5      0.0592  85.7  0.487 
 8        0.485        0.474      186173       0.186        0.195     -9.71          0      0.0289  78.8  0.366 
 9        0.485        0.469      146840       0.96         0.149    -18.7           0      0.0741 130.   0.621 
10       -0.504        0.571      476304       0.873        0.092     -6.94          0      0.0446 127.   0.119 

Tags: r, dplyr

Solution

Assuming dist is a tibble and choice is a vector of values (whose length equals the number of columns in dist), I would try something like this:

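library(dplyr)  # provides %>% and as_tibble()

amend_row <- function(amend_vals, ...) {
   # Gather the current row's values into one vector, then subtract element-wise
   c(...) - amend_vals
}

# unlist() turns a one-row tibble into a named vector; a plain vector passes through unchanged
purrr::pmap(dist, ~ amend_row(amend_vals = unlist(choice), ...)) %>%
   do.call(what = rbind, args = .) %>%
   as_tibble() %>% 
   purrr::set_names(nm = colnames(dist))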
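
As a possible alternative to the row-wise approach above, here is a minimal sketch of two column-wise options. The loop in the question fails at parse time because the name on the left of = in a function call must be a plain name or string, not an expression such as distUseful[i]; in addition, mutate_() is deprecated. The small dist and choice below are toy stand-ins built from a few of the posted values, and dist_loop and dist_across are names made up for this example. It assumes dplyr 1.0.0 or later for across() and cur_column().

```r
library(dplyr)

# Toy stand-ins for the question's objects (three columns, three rows)
dist <- tibble(
  acousticness = c(0.991, 0.643, 0.993),
  danceability = c(0.598, 0.852, 0.647),
  tempo        = c(149.976, 86.889, 97.6)
)
choice <- tibble(acousticness = 0.511, danceability = 0.403, tempo = 135.047)

# Option 1: keep the loop, but build the column name with `:=`.
# `!!col :=` injects the string held in `col` as the name of the column to overwrite.
dist_loop <- dist
for (col in names(choice)) {
  dist_loop <- dist_loop %>%
    mutate(!!col := .data[[col]] - choice[[col]])
}

# Option 2: no loop; across() applies the same subtraction to every shared column.
# cur_column() returns the name of the column currently being processed.
dist_across <- dist %>%
  mutate(across(all_of(names(choice)), ~ .x - choice[[cur_column()]]))
```

Looping over names(choice) rather than over positions also sidesteps the duplicated "duration_ms" entry in distUseful, which would otherwise leave the index out of step with the columns.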
