Sequential subtraction in a data.frame in R or Python pandas

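Problem description

I am working on a project in which I allocate products among facilities according to their rank. The data looks like this: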

> product_small

  facility_abbr itemid need_qty rank available_qty
7            NORW 000643        8    1            40
8            CARN 000643       16    2            40
9            NVMC 000643       24    3            40
10           SEBT 000643       24    3            40
11           SNEC 000643       32    5            40
12           SEMC 000643       96    6            40
13           STAN 000643      784    7            40
130          HFAD 034199       35    1             8
131          EAST 034199       40    2             8
132          NVMC 034199      110    3             8
133          HFHH 034199      113    4             8
134          CARN 034199      182    5             8

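There are two products here, 000643 and 034199, with available quantities of 40 and 8 respectively. I want to compute the quantity allocated to each facility and keep track of the quantity left over. The result should look like this: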

     facility_abbr itemid need_qty rank available_qty allocated leftover
7            NORW 000643        8    1            40     8      0
8            CARN 000643       16    2            40    16      0
9            NVMC 000643       24    3            40    16      0
10           SEBT 000643       24    3            40     0      0
11           SNEC 000643       32    5            40     0      0
12           SEMC 000643       96    6            40     0      0
13           STAN 000643      784    7            40     0      0
130          HFAD 034199       35    1             8     8      0
131          EAST 034199       40    2             8     0      0
132          NVMC 034199      110    3             8     0      0
133          HFHH 034199      113    4             8     0      0
134          CARN 034199      182    5             8     0      0

What is the best way to solve this in R or Python pandas? I got stuck writing the loop... Thanks a lot!

Tags: python, r, pandas

Solution

Here is a dplyr solution that avoids an explicit loop:

df = read.table(text = "
facility_abbr itemid need_qty rank available_qty
7            NORW 000643        8    1            40
8            CARN 000643       16    2            40
9            NVMC 000643       24    3            40
10           SEBT 000643       24    3            40
11           SNEC 000643       32    5            40
12           SEMC 000643       96    6            40
13           STAN 000643      784    7            40
130          HFAD 034199       35    1             8
131          EAST 034199       40    2             8
132          NVMC 034199      110    3             8
133          HFHH 034199      113    4             8
134          CARN 034199      182    5             8
", header = TRUE)

library(dplyr)

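# Assumes rows are already ordered by rank within each itemid.
df %>%
  group_by(itemid) %>%
  # leftover: stock remaining after serving this row and every earlier row
  mutate(leftover = available_qty - cumsum(need_qty),
         # allocated: when stock runs short, hand out whatever the previous row left
         allocated = case_when(leftover < 0 & row_number() == 1 ~ available_qty,
                               leftover < 0 & row_number() > 1 ~ lag(leftover),
                               TRUE ~ need_qty),
         # clamp negative values (stock already exhausted) to zero
         allocated = ifelse(allocated < 0, 0, allocated),
         leftover = ifelse(leftover < 0, 0, leftover)) %>%
  ungroup()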

# # A tibble: 12 x 7
#   facility_abbr itemid need_qty  rank available_qty leftover allocated
#   <fct>          <int>    <int> <int>         <int>    <dbl>     <dbl>
# 1 NORW             643        8     1            40       32         8
# 2 CARN             643       16     2            40       16        16
# 3 NVMC             643       24     3            40        0        16
# 4 SEBT             643       24     3            40        0         0
# 5 SNEC             643       32     5            40        0         0
# 6 SEMC             643       96     6            40        0         0
# 7 STAN             643      784     7            40        0         0
# 8 HFAD           34199       35     1             8        0         8
# 9 EAST           34199       40     2             8        0         0
#10 NVMC           34199      110     3             8        0         0
#11 HFHH           34199      113     4             8        0         0
#12 CARN           34199      182     5             8        0         0
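
Since the question also asks about pandas, the same idea can be sketched in Python. This is only a rough equivalent of the dplyr answer above, not part of the original answer. It assumes the sample data from the question is typed in by hand, that `itemid` is stored as a string so the leading zeros are kept, and that ties in `rank` are served in row order. The helper names `cum_need` and `before` are made up for illustration.

```python
import pandas as pd

# Sample data from the question; itemid is a string so leading zeros survive.
df = pd.DataFrame({
    "facility_abbr": ["NORW", "CARN", "NVMC", "SEBT", "SNEC", "SEMC", "STAN",
                      "HFAD", "EAST", "NVMC", "HFHH", "CARN"],
    "itemid": ["000643"] * 7 + ["034199"] * 5,
    "need_qty": [8, 16, 24, 24, 32, 96, 784, 35, 40, 110, 113, 182],
    "rank": [1, 2, 3, 3, 5, 6, 7, 1, 2, 3, 4, 5],
    "available_qty": [40] * 7 + [8] * 5,
})

# Rows must be in allocation order within each item.
df = df.sort_values(["itemid", "rank"])

# Running total of demand per item, in rank order.
cum_need = df.groupby("itemid")["need_qty"].cumsum()

# Stock remaining after each row has been served, floored at zero.
df["leftover"] = (df["available_qty"] - cum_need).clip(lower=0)

# Stock remaining before each row: available minus the demand of earlier rows.
before = (df["available_qty"] - (cum_need - df["need_qty"])).clip(lower=0)

# What a facility receives is the drop in stock across its row.
df["allocated"] = before - df["leftover"]

print(df)
```

On the sample data this gives the same `leftover` and `allocated` columns as the dplyr output. Because `allocated` is computed as the stock before a row minus the stock after it, each facility receives the smaller of its need and the remaining stock, and no per-row loop is needed.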
