How to group duplicate values to a single value and pull associated values with that column value in R?

Problem description

I have a dataframe that looks like so:


df <- data.frame(
  Location = c("buildinga", "buildinga", "buildinga", "buildingb", "buildingb", "buildingb", "buildingc", "buildingc", "buildingc),
  Category   = c(candy, candy, snacks, candy, snacks, soda, soda, candy, soda)
  Calories   = 200, 250, 150, 180, 200, 80, 140, 200, 210)
)

I want to group 'Location' by just a single building and pull corresponding values for each location (so building a, b and c have total calories for candy, snacks, and soda).

I tried doing group_by(location) %>% summarize(count(n=()) but that still gave me each location. I want to remove duplicates for location but not for 'Category' or 'Calories'.

Tags: r, dataframe, group-by, subset

Solution


Using dplyr, you can group your data by both Location and Category with group_by() and then sum the calories within each group.

library(dplyr)
df %>%
  group_by(Location, Category) %>%
  summarise(Count = sum(Calories))

# A tibble: 7 x 3
# Groups:   Location [3]
  Location  Category Count
  <fct>     <fct>    <dbl>
1 buildinga candy      450
2 buildinga snacks     150
3 buildingb candy      180
4 buildingb snacks     200
5 buildingb soda        80
6 buildingc candy      200
7 buildingc soda       350

Is this what you are looking for?
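
If the goal is instead exactly one row per building, the grouped totals could be spread into one column per category. The following is only a sketch: it assumes the tidyr package is installed (tidyr 1.1.0 or later for the scalar values_fill, dplyr 1.0.0 or later for .groups), neither of which is mentioned in the question, and the names Total and wide are chosen here purely for illustration.

```r
library(dplyr)
library(tidyr)  # assumption: tidyr is available; it is not used in the original answer

# Same corrected data as in the "Data" section of this answer
df <- data.frame(
  Location = c("buildinga", "buildinga", "buildinga", "buildingb", "buildingb",
               "buildingb", "buildingc", "buildingc", "buildingc"),
  Category = c("candy", "candy", "snacks", "candy", "snacks",
               "soda", "soda", "candy", "soda"),
  Calories = c(200, 250, 150, 180, 200, 80, 140, 200, 210)
)

wide <- df %>%
  group_by(Location, Category) %>%
  # Sum calories per Location/Category pair and drop the grouping afterwards
  summarise(Total = sum(Calories), .groups = "drop") %>%
  # One column per category; combinations that never occur are filled with 0
  pivot_wider(names_from = Category, values_from = Total, values_fill = 0)

print(wide)
```

With this data the result should have three rows, one per building. buildinga has no soda and buildingc has no snacks, so those cells are filled with 0 rather than NA.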

Data

Your example data has some typos (missing quotes and a missing c()), so here is the version that I used:

df <- data.frame(
  Location = c("buildinga", "buildinga", "buildinga", "buildingb", "buildingb", "buildingb", "buildingc", "buildingc", "buildingc"),
  Category   = c("candy", "candy", "snacks", "candy", "snacks", "soda", "soda", "candy", "soda"),
  Calories   = c(200, 250, 150, 180, 200, 80, 140, 200, 210)
)
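
For comparison, roughly the same totals can be obtained without dplyr. This is a minimal base R sketch, not part of the original answer: aggregate() gives the long format (one row per Location/Category pair) and xtabs() gives a Location by Category table. The object names totals and tab are illustrative only.

```r
# Same corrected data as above
df <- data.frame(
  Location = c("buildinga", "buildinga", "buildinga", "buildingb", "buildingb",
               "buildingb", "buildingc", "buildingc", "buildingc"),
  Category = c("candy", "candy", "snacks", "candy", "snacks",
               "soda", "soda", "candy", "soda"),
  Calories = c(200, 250, 150, 180, 200, 80, 140, 200, 210)
)

# Long format: one row per Location/Category combination that occurs in the data
totals <- aggregate(Calories ~ Location + Category, data = df, FUN = sum)

# Wide format: cross-tabulation with summed Calories in each cell
# (combinations that never occur show up as 0)
tab <- xtabs(Calories ~ Location + Category, data = df)

print(totals)
print(tab)
```

Note that aggregate() orders its rows differently from the dplyr output above, so compare values by Location and Category rather than by row position.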
