Multiplying multiple columns with each other into a new data frame in R

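Problem description

I want to multiply many binary variables with each other to create new columns, so-called interaction variables. My dataset is structured as follows: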

YearCountry <- data.frame( Time = c("2000","2001", "2002", "2003",
                           "2000","2001", "2002", "2003",
                           "2000","2001", "2002", "2003"),
                  AL = c(1,1,1,1,0,0,0,0,0,0,0,0),
                  FR = c(0,0,0,0,1,1,1,1,0,0,0,0),
                  UK = c(0,0,0,0,0,0,0,0,1,1,1,1),
                  Y2000d = c(1,0,0,0,1,0,0,0,1,0,0,0),
                  Y2001d = c(0,1,0,0,0,1,0,0,0,1,0,0),
                  Y2002d = c(0,0,1,0,0,0,1,0,0,0,1,0),
                  Y2003d = c(0,0,0,1,0,0,0,1,0,0,0,1))
YearCountry

   Time AL FR UK Y2000d Y2001d Y2002d Y2003d
1  2000  1  0  0      1      0      0      0
2  2001  1  0  0      0      1      0      0
3  2002  1  0  0      0      0      1      0
4  2003  1  0  0      0      0      0      1
5  2000  0  1  0      1      0      0      0
6  2001  0  1  0      0      1      0      0
7  2002  0  1  0      0      0      1      0
8  2003  0  1  0      0      0      0      1
9  2000  0  0  1      1      0      0      0
10 2001  0  0  1      0      1      0      0
11 2002  0  0  1      0      0      1      0
12 2003  0  0  1      0      0      0      1

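I need to multiply the binary variable for each country (AL, FR, UK) by the binary variable for each year, so that I end up with #countries x #years new variables. Here I have 3 countries and 4 years, which gives 12 new variables. My full data contains 105 countries and spans more than 20 years, so I need a general approach. I want the data to look like this: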

Interact <- data.frame(Time = c("2000","2001", "2002", "2003",
                                "2000","2001", "2002", "2003",
                                "2000","2001", "2002", "2003"),
                       Y2000xAL = c(1,0,0,0,0,0,0,0,0,0,0,0),
                       Y2001xAL = c(0,1,0,0,0,0,0,0,0,0,0,0),
                       Y2002xAL = c(0,0,1,0,0,0,0,0,0,0,0,0),
                       Y2003xAL = c(0,0,0,1,0,0,0,0,0,0,0,0),
                       Y2000xFR = c(0,0,0,0,1,0,0,0,0,0,0,0),
                       Y2001xFR = c(0,0,0,0,0,1,0,0,0,0,0,0),
                       Y2002xFR = c(0,0,0,0,0,0,1,0,0,0,0,0),
                       Y2003xFR = c(0,0,0,0,0,0,0,1,0,0,0,0),
                       Y2000xUK = c(0,0,0,0,0,0,0,0,1,0,0,0),
                       Y2001xUK = c(0,0,0,0,0,0,0,0,0,1,0,0),
                       Y2002xUK = c(0,0,0,0,0,0,0,0,0,0,1,0),
                       Y2003xUK = c(0,0,0,0,0,0,0,0,0,0,0,1))
Interact

   Time Y2000xAL Y2001xAL Y2002xAL Y2003xAL Y2000xFR Y2001xFR Y2002xFR Y2003xFR Y2000xUK Y2001xUK Y2002xUK Y2003xUK
1  2000        1        0        0        0        0        0        0        0        0        0        0        0
2  2001        0        1        0        0        0        0        0        0        0        0        0        0
3  2002        0        0        1        0        0        0        0        0        0        0        0        0
4  2003        0        0        0        1        0        0        0        0        0        0        0        0
5  2000        0        0        0        0        1        0        0        0        0        0        0        0
6  2001        0        0        0        0        0        1        0        0        0        0        0        0
7  2002        0        0        0        0        0        0        1        0        0        0        0        0
8  2003        0        0        0        0        0        0        0        1        0        0        0        0
9  2000        0        0        0        0        0        0        0        0        1        0        0        0
10 2001        0        0        0        0        0        0        0        0        0        1        0        0
11 2002        0        0        0        0        0        0        0        0        0        0        1        0
12 2003        0        0        0        0        0        0        0        0        0        0        0        1

Tags: r, dataframe

Solution

Here is one approach using dplyr::across. The final result can be turned into a regular data.frame with purrr::invoke, as shown in this answer.

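library(dplyr)
library(purrr)
library(stringr)  # provides str_replace(), used in rename_with() below

YearCountry %>% 
    # multiply each country dummy by the data frame of year dummies
    mutate(across(AL:UK, ~ . * select(cur_data(), Y2000d:Y2003d))) %>%
    # drop the original year dummies
    select(-(Y2000d:Y2003d)) %>% 
    # flatten the nested data-frame columns into a plain data.frame
    invoke(.f = data.frame) %>%
    # flattened names look like "AL.Y2000d"; remove the dot
    rename_with(~str_replace(.,"\\.",""))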
   Time ALY2000d ALY2001d ALY2002d ALY2003d FRY2000d FRY2001d FRY2002d FRY2003d UKY2000d UKY2001d UKY2002d UKY2003d
1  2000        1        0        0        0        0        0        0        0        0        0        0        0
2  2001        0        1        0        0        0        0        0        0        0        0        0        0
3  2002        0        0        1        0        0        0        0        0        0        0        0        0
4  2003        0        0        0        1        0        0        0        0        0        0        0        0
5  2000        0        0        0        0        1        0        0        0        0        0        0        0
6  2001        0        0        0        0        0        1        0        0        0        0        0        0
7  2002        0        0        0        0        0        0        1        0        0        0        0        0
8  2003        0        0        0        0        0        0        0        1        0        0        0        0
9  2000        0        0        0        0        0        0        0        0        1        0        0        0
10 2001        0        0        0        0        0        0        0        0        0        1        0        0
11 2002        0        0        0        0        0        0        0        0        0        0        1        0
12 2003        0        0        0        0        0        0        0        0        0        0        0        1
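
For comparison, the same country x year products can be sketched in base R, without dplyr or purrr. This is not part of the original answer: the helper name interact_dummies and its arguments are hypothetical, and it assumes the year dummies are named like Y2000d, as in the question. It also sidesteps cur_data() and invoke(), which I believe recent dplyr and purrr releases mark as deprecated.

```r
# Small reproduction of the question's data (3 countries, 4 years).
YearCountry <- data.frame(
  Time   = rep(c("2000", "2001", "2002", "2003"), times = 3),
  AL     = rep(c(1, 0, 0), each = 4),
  FR     = rep(c(0, 1, 0), each = 4),
  UK     = rep(c(0, 0, 1), each = 4),
  Y2000d = rep(c(1, 0, 0, 0), times = 3),
  Y2001d = rep(c(0, 1, 0, 0), times = 3),
  Y2002d = rep(c(0, 0, 1, 0), times = 3),
  Y2003d = rep(c(0, 0, 0, 1), times = 3)
)

# Hypothetical helper: multiply every country dummy by every year dummy.
interact_dummies <- function(df, country_cols, year_cols, id_cols = "Time") {
  # All year/country pairs; the first argument (year) varies fastest,
  # which gives the column order asked for in the question.
  pairs <- expand.grid(year = year_cols, country = country_cols,
                       stringsAsFactors = FALSE)
  # Element-wise product of each pair of dummy columns.
  products <- Map(function(y, cn) df[[y]] * df[[cn]],
                  pairs$year, pairs$country)
  # Name columns like "Y2000xAL": strip the trailing "d" of the year dummy.
  names(products) <- paste0(sub("d$", "", pairs$year), "x", pairs$country)
  cbind(df[id_cols], as.data.frame(products))
}

Interact <- interact_dummies(
  YearCountry,
  country_cols = c("AL", "FR", "UK"),
  # Assumption: year dummies follow the pattern "Y" + 4 digits + "d".
  year_cols    = grep("^Y[0-9]{4}d$", names(YearCountry), value = TRUE)
)
```

With 105 countries, country_cols could presumably be built the same way, for example with setdiff(names(df), c("Time", year_cols)), provided the data frame holds no other columns.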
