Group, summarize, and transpose

Problem description

I have a data frame that looks like this:

ctgroup (dataframe)

    Camera Trap Name  Animal Name  a_sum
 1  CAM27             Chicken      1
 2  CAM27             Dog          1
 3  CAM27             Dog          4
 4  CAM28             Cat          3
 5  CAM28             Dog          22
 6  CAM28             Dog          1

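*a_sum = number of animals recorded by the camera

So basically I want to group by 2 fields (Camera Trap Name, Scientific Name), total up the records in the "a_sum" column, and then transpose the data so that the animal names become the columns and the camera trap names become the rows. I want every animal name to appear as a column, with 0 where no data is available, i.e.,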

Camera trap name        Dog   Cat   Wolf   Chicken
   CAM28                 23     4     1      4
   CAM27                 5      0     0      4

I tried the following code:

dcast (ctgroup, Camera.Trap.name + Animal.name, value.var  = "a_sum")

I get the following error:

In dcast(ctgroup, Camera.Trap.name + Scientific.name, value.var = "a_sum") :
  The dcast generic in data.table has been passed a grouped_df and will attempt to redirect to the reshape2::dcast; please note that reshape2 is deprecated, and this redirection is now deprecated as well. Please do this redirection yourself like reshape2::dcast(ctgroup). In the next version, this warning will become an error.

I don't think I know enough to build the right code for this job.

Tags: r, group-by, summarize

Solution


Using data.table ...

# Load data.table.
library(data.table)

# Create the sample data set.
df <- data.frame(Camera = c("CAM27", "CAM27", "CAM27", "CAM28", "CAM28", "CAM28"),
                 Animal = c("Chicken", "Dog", "Dog", "Cat", "Dog", "Dog"),
                 a_sum = c(1, 1, 4, 3, 22, 1))

# Set the data.frame as a data.table.
setDT(df)

# Cast by `Camera` and `Animal` and sum `a_sum`.
dcast(df, Camera ~ Animal, value.var = "a_sum", fun.aggregate = sum)
#    Camera Cat Chicken Dog
# 1:  CAM27   0       1   5
# 2:  CAM28   3       0  23

# If you want to coerce back to a data.frame.
setDF(df)
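
The question also asks for a column for every animal name, including animals with no records (such as Wolf in the desired output). One possible way to get that, sketched here in base R rather than data.table, is to store the animal names as a factor whose levels list every expected animal and let xtabs() do the summing. The level list below is an assumption taken from the desired output in the question, and the object names obs, tab, and wide are only illustrative.

```r
# Same sample data as above. The factor levels are an assumption taken from
# the desired output in the question; no "Wolf" rows exist in the sample.
obs <- data.frame(
  Camera = c("CAM27", "CAM27", "CAM27", "CAM28", "CAM28", "CAM28"),
  Animal = factor(
    c("Chicken", "Dog", "Dog", "Cat", "Dog", "Dog"),
    levels = c("Dog", "Cat", "Wolf", "Chicken")
  ),
  a_sum = c(1, 1, 4, 3, 22, 1)
)

# xtabs() sums a_sum within each Camera x Animal cell and keeps
# factor levels that have no rows, so "Wolf" becomes a column of zeros.
tab <- xtabs(a_sum ~ Camera + Animal, data = obs)

# Turn the contingency table into a wide data.frame
# (camera names become the row names).
wide <- as.data.frame.matrix(tab)
wide
```

The result has one row per camera and one column per factor level, in the order the levels were declared. If the camera name is needed as a regular column instead of row names, it can be added afterwards from rownames(wide).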


