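r - A better way to conditionally add several new columns whose entries depend on other columns
Problem description
I have the following data frame: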
CustomerID Department Price SportswearDemand HomeDemand KidswearDemand WomenswearDemand
-------------------------------------------------------------------------------------------
1050091 Sportswear 497.6 0 0 0 0
1555018 Womenswear 336.0 0 0 0 0
210239 Womenswear 698.0 0 0 0 0
507556 Sportswear 209.0 0 0 0 0
1708193 Sportswear 209.0 0 0 0 0
1295733 Menswear 209.0 0 0 0 0
1213373 Sportswear 298.0 0 0 0 0
753471 Sportswear 209.0 0 0 0 0
82739 Menswear 349.0 0 0 0 0
1660995 Kidswear 424.6 0 0 0 0
.
.
.
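From now on, SportswearDemand and every column to its right are called "demand columns". I want to fill these columns based on the information in Department and Price: if a row's Department entry for some CustomerID is Sportswear, then I want that row's Price entered in SportswearDemand instead of the current zero. The same applies to the other demand columns. The final result should look like this: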
CustomerID Department Price SportswearDemand HomeDemand KidswearDemand WomenswearDemand
-------------------------------------------------------------------------------------------
1050091 Sportswear 497.6 497.6 0 0 0
1555018 Womenswear 336.0 0 0 0 336.0
210239 Womenswear 698.0 0 0 0 698.0
507556 Sportswear 209.0 209.0 0 0 0
1708193 Sportswear 209.0 209.0 0 0 0
1295733 Menswear 209.0 0 0 0 0
1213373 Sportswear 298.0 298.0 0 0 0
753471 Sportswear 209.0 209.0 0 0 0
82739 Menswear 349.0 0 0 0 0
1660995 Kidswear 424.6 0 0 424.6 0
.
.
.
I managed to solve it like this:
df$SportswearDemand <- with(df, ifelse(df$Department == "Sportswear", df$Price, 0))
df$HomeDemand <- with(df, ifelse(df$Department == "Home", df$Price, 0))
df$KidswearDemand <- with(df, ifelse(df$Department == "Kidswear", df$Price, 0))
df$WomenswearDemand <- with(df, ifelse(df$Department == "Womenswear", df$Price, 0))
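However, I have more than 30 of these demand columns, and I am wondering whether there is a better way than hard-coding 30 lines like this.
My first idea was to wrap one of these lines in a for loop, like this: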
DemandColumns # array of all the 30 different demand columns stored as strings
for (i in DemandColumns){
df$i <- with(df, ifelse(df$Department == substr(i,1,nchar(i)-6), df$Price, 0))
}
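But it just adds a single column named "i" filled with zeros. substr is used to get every character of the string except the trailing "Demand". Any help is appreciated.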
Solution
There is no need to initialize the "demand columns" beforehand; remove them first.
df[grep('Demand', names(df))] <- NULL
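Create copies of the Price and Department columns, then reshape the data to wide format. Note that this only creates columns for departments that actually occur in the data, and that the new columns are named with an underscore (for example Sportswear_Demand).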
library(dplyr)
library(tidyr)
df %>%
mutate(value = Price,
name = Department) %>%
pivot_wider(names_from = name, values_from = value,
names_glue = '{name}_Demand', values_fill = 0)
# CustomerID Department Price Sportswear_Demand Womenswear_Demand Menswear_Demand Kidswear_Demand
# <int> <chr> <dbl> <dbl> <dbl> <dbl> <dbl>
# 1 1050091 Sportswear 498. 498. 0 0 0
# 2 1555018 Womenswear 336 0 336 0 0
# 3 210239 Womenswear 698 0 698 0 0
# 4 507556 Sportswear 209 209 0 0 0
# 5 1708193 Sportswear 209 209 0 0 0
# 6 1295733 Menswear 209 0 0 209 0
# 7 1213373 Sportswear 298 298 0 0 0
# 8 753471 Sportswear 209 209 0 0 0
# 9 82739 Menswear 349 0 0 349 0
#10 1660995 Kidswear 425. 0 0 0 425.
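For reference, the loop from the question can also be made to work in base R. The sketch below is not part of the original answer; it assumes a small sample of the question's data and only four of the demand column names. The key point is that df$i always refers to a column literally named "i", whereas df[[i]] uses the string stored in the loop variable as the column name.

```r
# Small sample of the data from the question (assumed for illustration)
df <- data.frame(
  CustomerID = c(1050091, 1555018, 1295733, 1660995),
  Department = c("Sportswear", "Womenswear", "Menswear", "Kidswear"),
  Price      = c(497.6, 336.0, 209.0, 424.6),
  stringsAsFactors = FALSE
)

# Names of the demand columns to create (only four of the 30+ shown here)
DemandColumns <- c("SportswearDemand", "HomeDemand",
                   "KidswearDemand", "WomenswearDemand")

for (col in DemandColumns) {
  # Strip the trailing "Demand" to recover the department name
  dept <- sub("Demand$", "", col)
  # df[[col]] assigns to the column whose name is the value of col;
  # df$col would instead create a column literally called "col"
  df[[col]] <- ifelse(df$Department == dept, df$Price, 0)
}

print(df)
```

Unlike the pivot_wider approach, this keeps the original column names (without an underscore) and also creates columns such as HomeDemand for departments that do not occur in the data.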