Using cummax to make a new column

Problem description

I have 4 columns in my data set. The first is the family number, the second is the person number within that family, the third is the number of the trip that the person makes, and the fourth is the place of that person's activity. I want to create a fifth column.

 family   persons    trip      activity       
     1     1        1         home
     1     1        2         shopping
     1     1        3          home
     1     1        4         eating
     1     1        5         friends
     1     1        6          home
     1     2        1          home
     1     2        2           eating
     1     2        3           work
     1     2        3           shopping

As you can see in the data set above, we have information on 2 persons in the first family. The first person has 6 trips and the second one has 3 trips. Home and work are important in my analysis: I want to build loops based on the home and work activities. In other words, each loop is a set of activities that starts at home and finishes at home or work. For the first person we have 2 loops:

  first loop: home-> shopping -> home
  second loop: home -> eating -> friends -> home

For the second person we also have 2 loops:

  first loop: home -> eating -> work
  second loop: work -> shopping

I want to add a column that gives the loop number for each row of this data set, like this:

 family   persons    trip      activity     loop
     1     1        1         home          1
     1     1        2         shopping      1
     1     1        3         home          1
     1     1        4         eating        2
     1     1        5         friends       2
     1     1        6         home          2
     1     2        1         home          1
     1     2        2         eating        1
     1     2        3         work          1
     1     2        3         shopping      2

I have the following code:

vals <- c("work","home")

library(dplyr)

df9 <- df1 %>%
  group_by(SAMPN, PERNO) %>%
  mutate(loop = cummax(lag(1 + (TPURP %in% vals), default = 1)))

But it doesn't give me the correct output. When a person has 2 home activities, the loop number doesn't change. For example, for the first person the loop column is 1 in every row.

Tags: r, dataframe

Solution
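
One possible approach, offered as a sketch under assumptions rather than a definitive answer: `1 + (TPURP %in% vals)` only ever yields 1 or 2, so `cummax()` of it can never go above 2 and cannot count loops. A running count with `cumsum()` seems a better fit. The sketch below rebuilds the example table with the column names `family`, `persons`, `trip` and `activity` (in the real data these appear to be `SAMPN`, `PERNO` and `TPURP`). It assumes that a loop closes at every home or work row except a person's first row, so that the following row opens a new loop. The helper column `closes_loop` is introduced here purely for illustration.

```r
library(dplyr)

# Example data copied from the question; column names follow the printed table.
df1 <- data.frame(
  family   = 1,
  persons  = c(1, 1, 1, 1, 1, 1, 2, 2, 2, 2),
  trip     = c(1, 2, 3, 4, 5, 6, 1, 2, 3, 3),
  activity = c("home", "shopping", "home", "eating", "friends", "home",
               "home", "eating", "work", "shopping")
)

vals <- c("work", "home")

df9 <- df1 %>%
  group_by(family, persons) %>%
  mutate(
    # Assumption: a home/work row closes a loop unless it is the person's first row.
    closes_loop = activity %in% vals & row_number() > 1,
    # The row after a closing row starts the next loop, so count closings seen
    # strictly before the current row and add 1.
    loop = cumsum(lag(closes_loop, default = FALSE)) + 1L
  ) %>%
  ungroup() %>%
  select(-closes_loop)

print(df9)
```

On the example data this reproduces the `loop` column shown in the question (1, 1, 1, 2, 2, 2 for the first person and 1, 1, 1, 2 for the second). If a person's first activity is not home, the sketch still starts them in loop 1, which may or may not be what you want.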

