R - Apply function on multiple data frames

Question

I loaded several data sheets as data frames in R with:

temp = list.files(pattern="*.csv")
for (i in 1:length(temp)) assign(temp[i], read.csv(temp[i]))

Now I would like to apply a function on all data frames. I thought about something like:

kappa1_mean_h_stem <- lapply(df.list, mean_h_stem)

where df.list is a list of all the data frames and mean_h_stem is:

mean_h_stem <- function(x) {
  mean(x[1,3])
}

I want the function to return the mean of a specific column, but R tells me I have the wrong number of dimensions.

Tags: r, function, apply, dimensions

Answer


I think the problem comes from x[1,3], which selects only the single value in the first row of the third column rather than a whole column. I assume you want the mean of the same column in each of the data frames, so I modified your function slightly so that you pass both the data and the name or position of the column:

mean_h_stem <- function(dat, col) { mean(dat[, col], na.rm = TRUE) }
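To make the difference between the two indexing forms concrete, here is a small sketch. The data frame and its column names are made up for illustration and are not from the question.

```r
# Hypothetical toy data frame; the column names are assumptions.
df <- data.frame(id = 1:3, height = c(10, 12, NA), stem = c(2, 4, 6))

cell   <- df[1, 3]   # a single cell: row 1, column 3
column <- df[, 3]    # the whole third column as a numeric vector

mean(cell)                           # the mean of one value is just that value: 2
mean(column)                         # mean of the full column: 4
mean(df[, "height"], na.rm = TRUE)   # NA is dropped before averaging: 11
```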

The column can be selected by position, using an integer:

lapply(df.list, mean_h_stem, 2)

Or a column name, expressed as a string:

lapply(df.list, mean_h_stem, 'col_name')

Passing the second argument through lapply like this can feel a little unintuitive, so you can make it explicit with an anonymous function:

lapply(df.list, function(x) mean_h_stem(dat = x, col ='col_name'))
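As a quick check that the three calling styles are equivalent, here is a self-contained sketch. The list contents and the column name h_stem are assumptions standing in for your real data.

```r
mean_h_stem <- function(dat, col) mean(dat[, col], na.rm = TRUE)

# Hypothetical stand-in for df.list; names mimic CSV file names.
df.list <- list(
  a.csv = data.frame(id = 1:3, h_stem = c(1, 2, 3)),
  b.csv = data.frame(id = 1:2, h_stem = c(10, NA))
)

by_position <- lapply(df.list, mean_h_stem, 2)
by_name     <- lapply(df.list, mean_h_stem, "h_stem")
explicit    <- lapply(df.list, function(x) mean_h_stem(dat = x, col = "h_stem"))

# All three return the same named list.
identical(by_position, by_name) && identical(by_name, explicit)

# sapply simplifies the result to a named numeric vector.
means <- sapply(df.list, mean_h_stem, col = "h_stem")
```

If you want one number per data frame, the sapply form is usually more convenient than a list.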

This handles only one column at a time, as in your question, but the function is easy to extend to several columns.
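One possible way to handle several columns is to swap mean for colMeans. This is only a sketch: the helper name mean_cols and the column names are hypothetical, and it assumes all selected columns are numeric.

```r
# Hypothetical multi-column variant; drop = FALSE keeps a data frame
# even when a single column is requested, so colMeans always works.
mean_cols <- function(dat, cols) {
  colMeans(dat[, cols, drop = FALSE], na.rm = TRUE)
}

# Made-up example data.
df.list <- list(
  a = data.frame(h_stem = c(1, 2, 3), h_leaf = c(4, 5, 6)),
  b = data.frame(h_stem = c(10, NA), h_leaf = c(20, 40))
)

res <- lapply(df.list, mean_cols, cols = c("h_stem", "h_leaf"))

# Stack the per-data-frame results into a matrix: one row per data frame.
res_table <- do.call(rbind, res)
```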

As an aside, you could also read the CSV files straight into a list with lapply and read.csv, which gives you df.list directly:

# pattern is a regular expression, not a glob
temp <- list.files(pattern = "\\.csv$")
df.list <- lapply(temp, read.csv)
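A list built this way has no names, so it can help to name each element after its file. The sketch below is self-contained: it writes two small made-up CSV files to a temporary directory first, so the directory, file names and column names are all assumptions.

```r
# Create two hypothetical CSV files in a temporary directory.
demo_dir <- file.path(tempdir(), "csv_demo")
dir.create(demo_dir, showWarnings = FALSE)
write.csv(data.frame(id = 1:2, h_stem = c(1, 3)),
          file.path(demo_dir, "a.csv"), row.names = FALSE)
write.csv(data.frame(id = 1:2, h_stem = c(5, 7)),
          file.path(demo_dir, "b.csv"), row.names = FALSE)

# Read every CSV into one list and name the elements after the files.
files <- list.files(demo_dir, pattern = "\\.csv$", full.names = TRUE)
df.list <- lapply(files, read.csv)
names(df.list) <- basename(files)

# The names carry through to the results.
means <- sapply(df.list, function(x) mean(x[, "h_stem"], na.rm = TRUE))
```

If the data frames already exist as separate objects created with assign, as in the question, mget(temp) should collect them into a named list without re-reading the files.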
