Extract data by row number after setting a condition

Problem description

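I have a data.frame imported from an Excel file. The file uses an irregular layout to make it visually appealing, which leaves the data unusable as is. The data sits in repeating grouped blocks, and the word "Week" marks the start of a new entry. I am writing code to extract the relevant data. Here is an MWE: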

df = data.frame(x1 = c("Week", "Day", "Exercise", NA, NA, "Walk", "Week", "Day", "Exercise", NA, NA, "Run"),
                x2 = c("1", "1", NA, "Advice", NA, NA, "1", "2", NA, "Advice", NA, NA))
df
         x1     x2
1      Week      1
2       Day      1
3  Exercise   <NA>
4      <NA> Advice
5      <NA>   <NA>
6      Walk   <NA>
7      Week      1
8       Day      2
9  Exercise   <NA>
10     <NA> Advice
11     <NA>   <NA>
12      Run   <NA>

First, I want to create "Week" and "Day" variables that apply to the corresponding entries:

library(dplyr)
library(tidyr)

df = df %>%
  mutate(Week = case_when(x1 == "Week" ~ x2),
         Day  = case_when(x1 == "Day" ~ x2)) %>%
  fill(c(Week, Day), .direction = "downup") # fill NAs downward from the last present value, then upward for any leading NAs

df
         x1     x2 Week Day
1      Week      1    1   1
2       Day      1    1   1
3  Exercise   <NA>    1   1
4      <NA> Advice    1   1
5      <NA>   <NA>    1   1
6      Walk   <NA>    1   1
7      Week      1    1   1
8       Day      2    1   2
9  Exercise   <NA>    1   2
10     <NA> Advice    1   2
11     <NA>   <NA>    1   2
12      Run   <NA>    1   2
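
To make the filling step easier to follow, here is a minimal sketch of how fill() behaves with .direction = "downup" on a single column. The data frame `toy` and its column `v` are made up for illustration and are not part of the original data.

```r
library(tidyr)

# Toy column (hypothetical): a leading NA, a value, a gap, then another value
toy <- data.frame(v = c(NA, "a", NA, NA, "b", NA))

# "downup" first carries each value downward, then fills leading NAs upward
filled <- fill(toy, v, .direction = "downup")
filled$v
```

The same mechanism explains row 7 in the output above: the "Week" row comes before the "Day" row of its block, so downward filling gives it the Day value of the previous block.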

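Then I want to extract the exercise that was done, which is always in x1 (in the example, three rows below the "Exercise" label).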

The result should look like this:

   x1       x2     Week  Day   Exercise
   <fct>    <fct>  <fct> <fct> <fct>   
 1 Week     1      1     1     Walk    
 2 Day      1      1     1     Walk    
 3 Exercise NA     1     1     Walk    
 4 NA       Advice 1     1     Walk    
 5 NA       NA     1     1     Walk    
 6 Walk     NA     1     1     Walk    
 7 Week     1      1     1     Walk    
 8 Day      2      1     2     Run     
 9 Exercise NA     1     2     Run     
10 NA       Advice 1     2     Run     
11 NA       NA     1     2     Run     
12 Run      NA     1     2     Run  

How can I specify a row number relative to a condition and extract data from a given column in that row?

Tags: r, row, row-number

Solution


I like dplyr solutions, and after some searching I found the function nth:

library(dplyr)
library(stringr)

df = df %>%
  group_by(Week, Day) %>%
  mutate(Exercise = nth(x1, which(str_detect(x1, "Exercise")) + 3))

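which returns the number of the row (within each group) where str_detect finds "Exercise". Adding 3 moves three rows further down, and nth then retrieves the value of x1 at that row number.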
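
The expression can be unpacked step by step on a plain vector. The sketch below uses the x1 values of the first (Week 1, Day 1) group from the question; the intermediate names `hits`, `pos` and `exercise` are introduced here only for illustration.

```r
library(dplyr)
library(stringr)

# x1 values of the first (Week 1, Day 1) group from the question
x1 <- c("Week", "Day", "Exercise", NA, NA, "Walk", "Week")

hits <- str_detect(x1, "Exercise")  # TRUE at the label, NA where x1 is NA
pos  <- which(hits)                 # which() ignores FALSE and NA, leaving the position
exercise <- nth(x1, pos + 3)        # value three rows after the label

pos       # 3
exercise  # "Walk"
```

This relies on each group containing exactly one "Exercise" row. If a group had none or several, `pos` would not be a single number and the call would need extra handling.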

