Using logical operators in dplyr mutate

Problem description

I have a data frame that looks like this:

df <- data.frame(
  animals = c("cat; dog; bird", "dog; bird", "bird"),
  sentences = c("the cat is brown; the bird is green and blue",
                "the dog is black; the bird is yellow and blue", "the bird is blue"),
  year = c("2010", "2012", "2001"), stringsAsFactors = FALSE)

df$year <- as.numeric(df$year)

> df
         animals                                                        sentences year
1 cat; dog; bird                     the cat is brown; the bird is green and blue 2010
2      dog; bird                    the dog is black; the bird is yellow and blue 2012
3           bird                                                 the bird is blue 2001

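For each row, I want the total number of times its animals occur in the sentences column over the previous 5 years (including the same year).

Edit

For example: the animals in row 2, dog and bird, occur 3 times in the sentences column over the previous 5 years (including the same year). In 2012: "the dog is black" and "the bird is yellow and blue"; in 2010: "the bird is green and blue". Total = 3.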
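The count for row 2 can be checked by hand. The following is only a sketch in base R, assuming the data frame as printed above and a window running from year - 5 to year; the names target, in_window and count_one are introduced here purely for illustration.

```r
# Data as printed in the question.
df <- data.frame(
  animals = c("cat; dog; bird", "dog; bird", "bird"),
  sentences = c("the cat is brown; the bird is green and blue",
                "the dog is black; the bird is yellow and blue",
                "the bird is blue"),
  year = c(2010, 2012, 2001),
  stringsAsFactors = FALSE
)

# Row 2: the animals to look for and the window ending in 2012.
target <- strsplit(df$animals[2], "; ", fixed = TRUE)[[1]]  # "dog" "bird"
in_window <- df$year >= df$year[2] - 5 & df$year <= df$year[2]

# All sentences from the rows inside the window, one per element.
window_sentences <- unlist(strsplit(df$sentences[in_window], "; ", fixed = TRUE))

# Count how often one animal appears across those sentences.
# gregexpr() returns -1 for "no match", so only positive positions count.
count_one <- function(animal) {
  hits <- gregexpr(animal, window_sentences, fixed = TRUE)
  sum(vapply(hits, function(h) sum(h > 0), integer(1)))
}

row2_sum <- sum(vapply(target, count_one, integer(1)))
print(row2_sum)  # 3
```

Only the 2010 and 2012 rows fall inside the window; "dog" is found once and "bird" twice.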

Desired result

# A tibble: 3 x 4
  animals        sentences                                                         year   SUM
  <chr>          <chr>                                                            <dbl> <int>
1 cat; dog; bird the cat is brown; the bird is green and blue                      2010     2
2 dog; bird      the dog is black; the bird is yellow and blue                     2012     3
3 bird           the bird is blue                                                  2001     1

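Attempted solution

I took the following code from here and added a logical condition, animals[(year>=year-5) & (year<=year)], but it does not give me the desired output. What am I doing wrong?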

library(dplyr)
library(stringr)
library(purrr)

string <- unlist(str_split(df$sentences, ";"))

df %>% rowwise %>%
  mutate(SUM = str_split(animals[(year>=year-5) & (year<=year)], "; ", simplify = TRUE) %>%
           map( ~ str_count(string, .)) %>%
           unlist %>% sum)

Any help would be much appreciated :)

Tags: r, dplyr, logical-operators

Solution


Try:

library(dplyr)

df %>% 
  mutate(SUM = sapply(strsplit(animals, "; "), length),
         SUM = sapply(year, function(x) sum(SUM[between(year, x - 5 + 1, x)])))

This is the output:

         animals                                                        sentences year SUM
1 cat; dog; bird the cat is brown; the dog is barking; the bird is green and blue 2010   3
2      dog; bird                    the dog is black; the bird is yellow and blue 2018   2
3           bird                                                 the bird is blue 2001   1
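To see where these numbers come from, the two mutate steps can be traced by hand. This is a sketch in base R, assuming the animals and year values shown in the output above (note the 2018 in row 2); n_animals and window_sum are names chosen here for illustration.

```r
# Values assumed from the output above (year 2018 in row 2).
animals <- c("cat; dog; bird", "dog; bird", "bird")
year <- c(2010, 2018, 2001)

# Step 1: number of animals listed in each row.
n_animals <- lengths(strsplit(animals, "; ", fixed = TRUE))
print(n_animals)  # 3 2 1

# Step 2: for each year x, add up the counts of all rows whose year
# lies in [x - 4, x]; this is the range between(year, x - 5 + 1, x) selects.
window_sum <- sapply(year, function(x) sum(n_animals[year >= x - 4 & year <= x]))
print(window_sum)  # 3 2 1
```

No two of these years fall within the same 5-year window, so each row only sees its own count. With 2012 in place of 2018, rows 1 and 2 would share a window and the second value would become 5.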

Of course, the value for 2010 does not match your desired output, because you did not provide data for the years before it.
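As for what went wrong in the attempt: under rowwise, year is a single value inside mutate, so (year>=year-5) & (year<=year) compares the row with itself and is always TRUE, and the subset returns the row's own animals unchanged. The string vector, meanwhile, holds the sentences of every year, so nothing is restricted to a window. Note also that the snippet above counts entries of the animals column rather than their occurrences in sentences. If the latter is what is wanted, one possible approach is sketched below in base R, assuming the data printed in the question and a window from year - 5 to year; count_in_window is a hypothetical helper, not a dplyr function.

```r
# Data as printed in the question.
df <- data.frame(
  animals = c("cat; dog; bird", "dog; bird", "bird"),
  sentences = c("the cat is brown; the bird is green and blue",
                "the dog is black; the bird is yellow and blue",
                "the bird is blue"),
  year = c(2010, 2012, 2001),
  stringsAsFactors = FALSE
)

# Hypothetical helper: count occurrences of the animals of row i in the
# sentences of all rows whose year lies in [year[i] - 5, year[i]].
count_in_window <- function(i, data) {
  targets <- strsplit(data$animals[i], "; ", fixed = TRUE)[[1]]
  in_window <- data$year >= data$year[i] - 5 & data$year <= data$year[i]
  text <- data$sentences[in_window]
  per_animal <- vapply(targets, function(a) {
    hits <- gregexpr(a, text, fixed = TRUE)
    # gregexpr() returns -1 where nothing matches; count positive positions only.
    sum(vapply(hits, function(h) sum(h > 0), integer(1)))
  }, integer(1))
  sum(per_animal)
}

df$SUM <- vapply(seq_len(nrow(df)), count_in_window, integer(1), data = df)
print(df$SUM)  # 2 3 1
```

The matching here is plain substring matching, so an animal name that is part of a longer word would also be counted; a word-boundary regular expression would be needed to rule that out.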

