How can I combine different rows of a data frame, selecting them by the factor names in another column?

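Problem description

I have a data frame with two variables: the first is a factor and the second is numeric. I want to sum several rows according to the factor name in each row.

I want each sum stored in the row that had the higher value to begin with, and the other rows that were added to it removed. The data frame is ordered. How can I get something like this?

For example: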

I have:                         I want:

Things     Numbers             Things     Numbers
Bottle       35                 Bottle       35
Pencil       27                 Paper        32
Paper        24                 Pencil       29
Pen          13                 Pen          13
Phone        10                 Phone        10
Apple         9                 Apple         9
Chair         7
Bus           2
Kitchen       1

where:
Paper = Paper + Chair + Kitchen
Pencil = Pencil + Bus

Tags: r, dataframe

Solution


library(tidyverse)

df <- tibble::tribble(
    ~Things, ~Numbers,
   "Bottle",       35,
   "Pencil",       27,
    "Paper",       24,
      "Pen",       13,
    "Phone",       10,
    "Apple",        9,
    "Chair",        7,
      "Bus",        2,
  "Kitchen",        1
  )

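df %>%

  # The question describes Things as a factor. If it is one in your data,
  # convert it to character first so that every case_when() branch returns
  # the same type (older dplyr versions may raise an error otherwise):
  # mutate(Things = as.character(Things)) %>%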

  mutate(
    Things = case_when(
      Things %in% c("Paper", "Chair", "Kitchen") ~ "Paper",
      Things %in% c("Pencil", "Bus") ~ "Pencil",
      TRUE ~ Things
    )
  ) %>%
  group_by(Things) %>%
  summarise(
    Numbers = sum(Numbers)
  ) %>%
  arrange(desc(Numbers))
#> # A tibble: 6 x 2
#>   Things Numbers
#>   <chr>    <dbl>
#> 1 Bottle      35
#> 2 Paper       32
#> 3 Pencil      29
#> 4 Pen         13
#> 5 Phone       10
#> 6 Apple        9

Created on 2019-05-21 by the reprex package (v0.2.1)
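
For comparison, the same collapse can be sketched in base R with a named lookup vector instead of case_when(). This is only a minimal sketch and not part of the original answer: the `merge_into` vector and the helper names `things`, `mapped`, and `result` are assumptions introduced for illustration, and the input is built as a factor column to match the description in the question.

```r
# Hypothetical input: Things stored as a factor, as described in the question.
df <- data.frame(
  Things  = factor(c("Bottle", "Pencil", "Paper", "Pen", "Phone",
                     "Apple", "Chair", "Bus", "Kitchen")),
  Numbers = c(35, 27, 24, 13, 10, 9, 7, 2, 1)
)

# Lookup vector (an assumption for illustration): names are the rows to fold
# in, values are the rows that absorb them.
merge_into <- c(Chair = "Paper", Kitchen = "Paper", Bus = "Pencil")

# Work on character values so the factor levels do not get in the way.
things <- as.character(df$Things)
mapped <- unname(merge_into[things])             # NA where no mapping exists
things <- ifelse(is.na(mapped), things, mapped)  # keep unmapped names as-is

# Sum Numbers within each merged name, then sort from largest to smallest.
result <- aggregate(
  Numbers ~ Things,
  data = data.frame(Things = things, Numbers = df$Numbers),
  FUN = sum
)
result <- result[order(-result$Numbers), ]
rownames(result) <- NULL
result
```

With the data above, this should produce the same six rows as the dplyr pipeline. Keeping the mapping in one named vector may be easier to maintain than a growing list of case_when() branches when many names have to be merged.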

