lapply() output as a data frame with multiple values from multiple functions - R

Problem description

I have this summary data frame (from this question):

lst <- lapply(1:ncol(mtcars), function(i){
  x <- mtcars[[i]]
  data.frame(
    Variable_name = colnames(mtcars)[[i]],
    sum_unique = NROW(unique(x)), 
    NA_count = sum(is.na(x)), 
    NA_percent = round(sum(is.na(x))/NROW(x),2))  
  })
do.call(rbind, lst)

I want to add the five highest and five lowest values for each column:

lst <- lapply(1:ncol(mtcars), function(i){
  x <- mtcars[[i]]
  data.frame(
    variable_name = colnames(mtcars)[[i]],
    distinct = NROW(unique(x)), 
    NA_count = sum(is.na(x)), 
    NA_percent = round(sum(is.na(x))/NROW(x),2),
    first_5 = paste0(sort(x, decreasing=TRUE)[1:5],";"),
    last_5 = paste0(sort(x)[1:5],";")
  )   
})
do.call(rbind, lst)

But this creates a new row for each first_5 and last_5 value. Why does this happen, and how can I fix it?

Tags: r, dplyr, tidyverse, lapply

Solution


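You're almost there. Because you have five numbers going into a single cell, paste0 on its own can't do the job: it is vectorized, so it returns five separate strings, and data.frame() recycles the other single-value columns to match, giving five rows per variable. One solution is to add toString, which joins the five values into one string, like this: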

lst <- lapply(1:ncol(mtcars), function(i){
  x <- mtcars[[i]]
  data.frame(
    variable_name = colnames(mtcars)[[i]],
    distinct = NROW(unique(x)), 
    NA_count = sum(is.na(x)), 
    NA_percent = round(sum(is.na(x))/NROW(x),2),
    first_5 = paste0(toString(sort(x, decreasing=TRUE)[1:5]),";"),
    last_5 = paste0(toString(sort(x)[1:5]),";")
  )   
})
do.call(rbind, lst)
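
As a rough illustration of why the extra rows appear, and of one possible alternative to toString, the sketch below first reproduces the recycling on a single column and then joins the values with the collapse argument of paste(). The helper name summarise_column is hypothetical and used only for this example. It also swaps [1:5] for head(..., 5), on the assumption that columns with fewer than five values should not be padded with NA.

```r
# paste0() is vectorized: five inputs give five strings, and data.frame()
# recycles the length-1 columns to match, hence five rows per variable.
x <- mtcars[["mpg"]]
pieces <- paste0(sort(x, decreasing = TRUE)[1:5], ";")
length(pieces)                                              # 5
nrow(data.frame(variable_name = "mpg", first_5 = pieces))   # 5

# Alternative to toString(): let paste() collapse the values into one string.
# summarise_column() is a hypothetical helper, not part of the original answer.
summarise_column <- function(name, x) {
  data.frame(
    variable_name = name,
    distinct = length(unique(x)),
    NA_count = sum(is.na(x)),
    NA_percent = round(mean(is.na(x)), 2),
    # sort() drops NA by default; head() avoids padding short columns with NA
    first_5 = paste(head(sort(x, decreasing = TRUE), 5), collapse = "; "),
    last_5 = paste(head(sort(x), 5), collapse = "; ")
  )
}

lst <- lapply(names(mtcars), function(nm) summarise_column(nm, mtcars[[nm]]))
result <- do.call(rbind, lst)
result
```

With this version each variable should occupy exactly one row, and the separator is chosen in one place rather than by combining toString with a trailing ";".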
