RStudio: writing a for loop that applies a custom function to an input vector and outputs a separate data frame for each element of that vector

Problem description

I have a data frame that contains the lower and upper bounds of several parameters for each fruit category. It looks like this:

+----------+-----------+-------+-------+
| Category | Parameter | Lower | Upper |
+----------+-----------+-------+-------+
| Apple    | alpha     | 10    | 20    |
+----------+-----------+-------+-------+
| Apple    | beta      | 20    | 30    |
+----------+-----------+-------+-------+
| Orange   | alpha     | 10    | 20    |
+----------+-----------+-------+-------+
| Orange   | beta      | 30    | 40    |
+----------+-----------+-------+-------+
| Orange   | gamma     | 50    | 60    |
+----------+-----------+-------+-------+
| Pear     | alpha     | 10    | 30    |
+----------+-----------+-------+-------+
| Pear     | beta      | 20    | 40    |
+----------+-----------+-------+-------+
| Pear     | gamma     | 20    | 30    |
+----------+-----------+-------+-------+
| Banana   | alpha     | 40    | 50    |
+----------+-----------+-------+-------+
| Banana   | beta      | 20    | 40    |
+----------+-----------+-------+-------+

I wrote a function that takes this data frame, the name of a fruit, and the desired length of my sequences:

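library(dplyr)  # filter() and %>%
library(purrr)  # map2(), set_names(), cross_df()

param_grid <- function(df, fruit, length) {
  # keep only the rows for the requested fruit
  df_fruit <- df %>%
    filter(Category == fruit)

  # one sequence per parameter, then every combination of their values
  # (cross_df() is deprecated as of purrr 1.0.0 in favour of tidyr::expand_grid())
  map2(df_fruit$Upper, df_fruit$Lower, seq, length.out = length) %>%
    set_names(df_fruit$Parameter) %>%
    cross_df()
}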
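
For readers without dplyr and purrr installed, the same idea can be sketched in base R. This is only an illustration, not the asker's function: param_grid_base is a hypothetical name, the small df below holds just the Apple and Orange rows, it assumes the smaller number of each pair is the lower bound, and it runs each sequence from the lower to the upper bound.

```r
# Hypothetical stand-in data: two fruits from the table, smaller value taken as Lower.
df <- data.frame(
  Category  = c("Apple", "Apple", "Orange", "Orange", "Orange"),
  Parameter = c("alpha", "beta", "alpha", "beta", "gamma"),
  Lower     = c(10, 20, 10, 30, 50),
  Upper     = c(20, 30, 20, 40, 60),
  stringsAsFactors = FALSE
)

# Base R sketch of the grid: one sequence per parameter, then all combinations.
param_grid_base <- function(df, fruit, length) {
  df_fruit <- df[df$Category == fruit, ]
  # Map() pairs up the bounds row by row, much like purrr::map2()
  seqs <- Map(function(lo, hi) seq(lo, hi, length.out = length),
              df_fruit$Lower, df_fruit$Upper)
  names(seqs) <- df_fruit$Parameter
  # expand.grid() builds every combination, varying the first column fastest
  expand.grid(seqs, KEEP.OUT.ATTRS = FALSE)
}

apple  <- param_grid_base(df, "Apple", length = 3)
orange <- param_grid_base(df, "Orange", length = 3)
dim(apple)   # 3^2 rows, 2 columns
dim(orange)  # 3^3 rows, 3 columns
```

The row count is length raised to the number of parameters, which is why length = 100 gives 10,000 rows for two parameters and 1,000,000 rows for three.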

Output

param_grid(df, "Apple", length=100)

# A tibble: 10,000 x 2
   alpha  beta
   <dbl> <dbl>
 1  10      20
 2  10.1    20
 3  10.2    20
 4  10.3    20
 5  10.4    20
 6  10.5    20
 7  10.6    20
 8  10.7    20
 9  10.8    20
10  10.9    20
# … with 9,990 more rows

Output

param_grid(df, "Orange", length=100)

# A tibble: 1,000,000 x 3
   alpha  beta gamma
   <dbl> <dbl> <dbl>
 1  10      30    50
 2  10.1    30    50
 3  10.2    30    50
 4  10.3    30    50
 5  10.4    30    50
 6  10.5    30    50
 7  10.6    30    50
 8  10.7    30    50
 9  10.8    30    50
10  10.9    30    50
# … with 999,990 more rows

Output

param_grid(df, "Pear", length=100)

# A tibble: 1,000,000 x 3
   alpha  beta gamma
   <dbl> <dbl> <dbl>
 1  10      20    20
 2  10.2    20    20
 3  10.4    20    20
 4  10.6    20    20
 5  10.8    20    20
 6  11.0    20    20
 7  11.2    20    20
 8  11.4    20    20
 9  11.6    20    20
10  11.8    20    20
# … with 999,990 more rows

Now I would like to write a for loop so that this function can be applied to several fruits:

names <- c("Apple","Orange","Pear")

for (i in names){
  results <- param_grid(df = df, fruit = i, length = 100)
  print(head(results))  # first 6 rows of each result
}

This works fine, but it just prints all 3 data frames one after another:

    alpha beta
1 20.00000   30
2 19.89899   30
3 19.79798   30
4 19.69697   30
5 19.59596   30
6 19.49495   30
     alpha beta gamma
1 20.00000   40    60
2 19.89899   40    60
3 19.79798   40    60
4 19.69697   40    60
5 19.59596   40    60
6 19.49495   40    60
     alpha beta gamma
1 30.00000   40    30
2 29.79798   40    30
3 29.59596   40    30
4 29.39394   40    30
5 29.19192   40    30
6 28.98990   40    30

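Is there a way to edit this for loop so that I get 3 separate data frames, one each for Apple, Orange and Pear? Or perhaps one large nested data frame in which each of the 3 tables can be called or subset (e.g. DF[[Apple]], DF[[Orange]], ...)?

Thank you very much for your help!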

Tags: r, tidyverse

Solution


The for loop only prints each result, and results is overwritten on every iteration. Instead, we can store each result in a list:

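# pre-allocate a list with one named slot per fruit
lst1 <- vector('list', length(names))
names(lst1) <- names
for (i in names) {
  # the first argument of param_grid() is called df, not data
  results <- param_grid(df = df, fruit = i, length = 100)
  lst1[[i]] <- results
}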
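
The same list can also be built without an explicit loop. The sketch below uses lapply() over a named vector; toy_grid is a hypothetical stand-in for param_grid() so that the snippet runs on its own, and fruits is used instead of names to avoid masking base::names.

```r
# Hypothetical stand-in for param_grid(); replace it with the real function.
toy_grid <- function(fruit, length) {
  data.frame(fruit = fruit, x = seq(0, 1, length.out = length))
}

fruits <- c("Apple", "Orange", "Pear")

# setNames() makes lapply() return a list named after the fruits
lst2 <- lapply(setNames(fruits, fruits), toy_grid, length = 5)

names(lst2)
nrow(lst2[["Pear"]])
```

With the real function, the call would look like lapply(setNames(fruits, fruits), function(f) param_grid(df, f, 100)).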

Then check the structure of the list that was created:

str(lst1)

We can extract the individual datasets using $ or [[:

lst1[[1]]
lst1[[2]]
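
Because the list is named, elements can also be pulled out by name, which is usually easier to read than a position. A small sketch with a toy list (lst_demo and its values are made up for illustration):

```r
# Toy list standing in for lst1 (hypothetical values)
lst_demo <- list(
  Apple  = data.frame(alpha = 1:2, beta = 3:4),
  Orange = data.frame(alpha = 1:2, beta = 3:4, gamma = 5:6)
)

by_position <- lst_demo[[2]]         # second element
by_name     <- lst_demo[["Orange"]]  # same element, by name
by_dollar   <- lst_demo$Orange       # same element, $ shorthand

fruit <- "Apple"
from_variable <- lst_demo[[fruit]]   # [[ accepts a variable; $ takes a literal name
```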

If we want to create separate objects whose names match the elements of the 'names' vector:

list2env(lst1, .GlobalEnv)

However, it is usually better to keep the data frames in a list and work with them there.
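
One reason to prefer the list is that a single call can then be applied to every data frame at once. A minimal sketch, again with a made-up toy list (grids) rather than the real output of param_grid():

```r
# Toy list of data frames with different shapes (hypothetical values)
grids <- list(
  Apple  = data.frame(alpha = c(10, 20), beta = c(20, 30)),
  Orange = data.frame(alpha = c(10, 15, 20), beta = c(30, 35, 40),
                      gamma = c(50, 55, 60))
)

# Apply the same function to every element; results keep the fruit names
rows <- vapply(grids, nrow, integer(1))
cols <- vapply(grids, ncol, integer(1))
rows
cols
```

With separate objects in the global environment, each of these calls would have to be written out once per fruit.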

