Problem description

Suppose I have two data.frames:

name_df = read.table(text = "player_name
a
b
c
d
e
f
g", header = T)

game_df = read.table(text = "game_id winner_name loser_name
1 a b
2 b a
3 a c
4 a d
5 b c
6 c d
7 d e
8 e f
9 f a
10 g f
11 g a
12 f e
13 a d", header = T)

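name_df contains the unique list of all winner_name and loser_name values in game_df. I want to create a new data.frame that has, for each person in name_df, one row for every game in which the given name (e.g. a) appears in either the winner_name or the loser_name column.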

So I essentially want to merge game_df with name_df, but the key column (player_name) can appear in either winner_name or loser_name.

So, for just a and b, the final output would look something like this:

final_df = read.table(text = "player_name game_id winner_name loser_name
a 1 a b
a 2 b a
a 3 a c
a 4 a d
a 9 f a
a 11 g a
a 13 a d
b 1 a b
b 2 b a
b 5 b c", header = T)

Tags: r, dplyr, data.table

Solution


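We can loop over the 'player_name' values in 'name_df' and, for each one, filter the rows of 'game_df' in which that name appears as either 'winner_name' or 'loser_name'. The .id argument stores the current name in a new 'player_name' column.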

library(dplyr)
library(purrr)
map_dfr(setNames(name_df$player_name, name_df$player_name), 
   ~ game_df %>%
        filter(winner_name %in% .x|loser_name %in% .x), .id = 'player_name')

Or, if there are many columns to check, use if_any:

map_dfr(setNames(name_df$player_name, name_df$player_name), 
  ~ {
     nm1 <- .x
     game_df %>%
       filter(if_any(c(winner_name, loser_name), ~ . %in%  nm1))
      }, .id = 'player_name')
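
The same result can probably also be reached without a per-player loop, by reshaping the games into a long lookup table and joining on it. The following is only a minimal base R sketch of that idea, not part of the original answer: the helper names `key_cols` and `lookup` are made up for illustration, and the data is re-typed from the question so the block runs on its own.

```r
# Example data re-typed from the question
name_df <- data.frame(player_name = letters[1:7])
game_df <- data.frame(
  game_id     = 1:13,
  winner_name = c("a", "b", "a", "a", "b", "c", "d", "e", "f", "g", "g", "f", "a"),
  loser_name  = c("b", "a", "c", "d", "c", "d", "e", "f", "a", "f", "a", "e", "d")
)

# Columns in which a player's name may appear (hypothetical helper name)
key_cols <- c("winner_name", "loser_name")

# Long lookup table: one row per (game, participant) pair.
# unlist() stacks the key columns one after another, so game_id is repeated
# once per key column; unique() guards against a name filling both columns.
lookup <- unique(data.frame(
  game_id     = rep(game_df$game_id, times = length(key_cols)),
  player_name = unlist(game_df[key_cols], use.names = FALSE)
))

# Join names to the lookup, then pull the full game rows back in
final_df <- merge(merge(name_df, lookup, by = "player_name"), game_df, by = "game_id")

# Sort and reorder columns to match the layout shown in the question
final_df <- final_df[order(final_df$player_name, final_df$game_id),
                     c("player_name", "game_id", "winner_name", "loser_name")]
rownames(final_df) <- NULL
```

To check more columns, it should be enough to extend `key_cols`; the rest of the sketch does not depend on how many columns are listed.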
