r - How to use group_by in R to combine rows of a data frame by certain columns while keeping other columns

Problem description

This should be simple, but I just can't get it to work.

I have a data frame, all_emissions_state_total, that looks like this:

tribe    state      scc       pollutant      emissions     unit     category    eis     year     fraction 
NA       WY         707       Methane        546           TON      onroad      NA      2011     NA
NA       WY         707       Methane        38            TON      onroad      NA      2011     NA
NA       WY         3405      Methane        2937          TON      onroad      NA      2011     NA
NA       MT         707       Methane        665           TON      onroad      NA      2011     NA
NA       WY         390       CO2            740           TON      onroad      NA      2011     NA
NA       MT         390       CO2            12            TON      onroad      NA      2011     NA
NA       WY         3405      Methane        329           TON      onroad      NA      2011     NA
GHYU     WY         390       CO2            44            TON      point       NA      2011     NA
BERS     WY         390       CO2            64445         TON      point       NA      2011     596
SDSH     KS         707       Methane        123           TON      point       NA      2011     3890
SDSH     MT         707       Methane        58            TON      point       NA      2011     112

I want it to look like this:

state       scc        pollutant        emissions        unit        year
WY          707        Methane          584              TON         2011
MT          707        Methane          723              TON         2011
WY          3405       Methane          3266             TON         2011
WY          390        CO2              65229            TON         2011
MT          390        CO2              12               TON         2011
KS          707        Methane          123              TON         2011

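In the original data frame all_emissions_state_total, the columns tribe, state, scc, pollutant, emissions, category, eis, and fraction vary. unit is always TON, and year is always 2011.

I want to group the rows that share the same state, scc, and pollutant, with the emissions column holding the sum over each group. tribe, category, eis, and fraction don't matter and can be dropped, but unit and year need to stay.

Here is what I thought would work: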

all_emissions_state <- all_emissions_state_total %>%
                                group_by( state, scc, pollutant ) %>% 
                                summarise( emissions = sum( emissions ) )

But the output I get is a 1x1 data frame all_emissions_state with a single column, emissions, holding one value: the sum of all emissions in the data frame.

Tags: r, group-by, dplyr

Solution

New_df <- do.call(rbind,
                  lapply(split(df, with(df, paste0(state, scc, pollutant))),
                         function(x) x[1, c("state", "scc", "pollutant", "emissions", "unit", "year")]))
New_df$emissions <- sapply(split(df$emissions, with(df, paste0(state, scc, pollutant))), sum)
row.names(New_df) <- NULL

> New_df
  state  scc pollutant emissions unit year
1    KS  707   Methane       123  TON 2011
2    MT  390       CO2        12  TON 2011
3    MT  707   Methane       723  TON 2011
4    WY 3405   Methane      3266  TON 2011
5    WY  390       CO2     65229  TON 2011
6    WY  707   Methane       584  TON 2011
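Since the question asks about group_by, here is a minimal dplyr sketch of the same aggregation. It rests on two assumptions that the question does not confirm. First, because unit and year are constant, adding them to the grouping keys is one simple way to carry them through summarise. Second, a commonly reported cause of a single-row result like the one described is that plyr was loaded after dplyr, so plyr's summarise masks dplyr's and ignores the grouping. Calling dplyr::summarise explicitly sidesteps that if it is the cause. The sample rows are copied from the question, and the .groups argument assumes dplyr 1.0.0 or later.

```r
library(dplyr)

# Sample rows copied from the question (only the columns needed here).
all_emissions_state_total <- data.frame(
  state     = c("WY", "WY", "WY", "MT", "WY", "MT", "WY", "WY", "WY", "KS", "MT"),
  scc       = c(707, 707, 3405, 707, 390, 390, 3405, 390, 390, 707, 707),
  pollutant = c("Methane", "Methane", "Methane", "Methane", "CO2", "CO2",
                "Methane", "CO2", "CO2", "Methane", "Methane"),
  emissions = c(546, 38, 2937, 665, 740, 12, 329, 44, 64445, 123, 58),
  unit      = "TON",
  year      = 2011
)

all_emissions_state <- all_emissions_state_total %>%
  # unit and year are constant, so grouping by them keeps them in the result
  # without changing which rows fall into each group.
  group_by(state, scc, pollutant, unit, year) %>%
  # Namespace-qualified in case plyr::summarise is masking dplyr's version
  # (an assumption about the asker's session).
  dplyr::summarise(emissions = sum(emissions), .groups = "drop") %>%
  # Restore the column order requested in the question.
  dplyr::select(state, scc, pollutant, emissions, unit, year)

print(all_emissions_state)
```

This should yield one row per state/scc/pollutant combination, six rows for the sample data. Running search() or conflicts() in the session is one way to check whether plyr sits ahead of dplyr on the search path.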
