首页 > 解决方案 > 具有条件的不同组的条形图

问题描述

我正在从事一个数据科学项目,我的目标是从庞大的数据集中提供一个摘要。

例如,我想知道有多少客户订购了 House Brand 一次、两次、两次以上。有多少人订购了自有品牌和非自有品牌?有多少人只订购了 nonHouse 品牌?

我怎样才能做到这一点?

样本数据集

PRODUCT_SUB_LINE_DESCR    MAJOR_CATEGORY_DESCR       CUST_REGION_DESCR
SUNDRY                         SMALL EQUIP           NORTH EAST REGION
SUNDRY                         SMALL EQUIP           SOUTH EAST REGION
SUNDRY                         SMALL EQUIP           SOUTH EAST REGION
SUNDRY                         SMALL EQUIP           NORTH EAST REGION
SUNDRY                         PREVENTIVE            SOUTH CENTRAL REGION
SUNDRY                         PREVENTIVE            SOUTH EAST REGION
SUNDRY                         PREVENTIVE            SOUTH EAST REGION
SUNDRY                         SMALL EQUIP           NORTH CENTRAL REGION
SUNDRY                         SMALL EQUIP           MOUNTAIN WEST REGION
SUNDRY                         SMALL EQUIP           MOUNTAIN WEST REGION
SUNDRY                         COMPOSITE             NORTH CENTRAL REGION
SUNDRY                         COMPOSITE             NORTH CENTRAL REGION
SUNDRY                         COMPOSITE             OHIO VALLEY REGION
SUNDRY                         COMPOSITE             NORTH EAST REGION

Sales   QtySold      MFGCOST    MarginDollars   new_ProductName
209.97  3             134.55    72.72            no
-76.15  -1            -44.85    -30.4            no
275.6   2             162.5     109.84           no
138.7   1             81.25     55.82            no
226     2             136       87.28            no
115     1             68        45.64            no
210.7   2             136       71.98            no
29      1             18.85     9.77             no
29      1             18.85     9.77             no
46.32   2             37.7      7.86             no
159.86  1             132.4     24.81            no
441.3   2             264.8     171.2            no
209.62  1             132.4     74.57            no
209.62  1             132.4     74.57            no

这不是原始数据集。我基本上在我的原始数据集中添加了一个新列,用于稍后的决策树分析。但现在,我想在这里制作一些情节。自有品牌被认为是自有品牌。

 new_ProductName = ifelse( PRODUCT_SUB_LINE_DESCR == "PRIVATE 
                          LABEL","yes","no")
 data = data.frame(new_Dataset, new_ProductName)

问题:

> group_by_region = data %>% group_by(PRODUCT_SUB_LINE_DESCR, 
      CUST_REGION_DESCR) %>% summarise(count=n(), sales=sum(Sales))

> mytable = table(group_by_region)
> barplot(mytable)
Error in barplot.default(mytable) : 'height' must be a vector or a matrix

标签: r

解决方案


推荐阅读