Trying to compute churn proportions within each gender, but I can only get proportions of the total number of customers

Problem description

1. Read the file

library(tidyverse)
churnData <- as_tibble(read.table("WA_Fn-UseC_-Telco-Customer-Churn.csv",
             sep=",",header=TRUE,stringsAsFactors=FALSE))

2. How many customers churned and how many did not?

 churnData %>%
   group_by(Churn) %>% 
   summarise(Count=n())
`summarise()` ungrouping output (override with `.groups` argument)
# A tibble: 2 x 2
  Churn Count
  <chr> <int>
1 No     5174
2 Yes    1869

3. Determine the number and proportion of female and male customers in the file

 churnData %>%
   group_by(gender) %>%
   summarise(Count=n(),Proportion=Count/nrow(churnData))
 
`summarise()` ungrouping output (override with `.groups` argument)
# A tibble: 2 x 3
  gender Count Proportion
  <chr>  <int>      <dbl>
1 Female  3488      0.495
2 Male    3555      0.505

4. Given the customer's gender, how likely is churn?

 churnData %>%
   group_by(gender,Churn) %>%
   summarise(Count=n(),Proportion=Count/nrow(churnData))

`summarise()` regrouping output by 'gender' (override with `.groups` argument)
# A tibble: 4 x 4
# Groups:   gender [2]
  gender Churn Count Proportion
  <chr>  <chr> <int>      <dbl>
1 Female No     2549      0.362
2 Female Yes     939      0.133
3 Male   No     2625      0.373
4 Male   Yes     930      0.132
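
The proportions above are relative to all customers. I tried dividing by the number of rows for each gender instead, but that fails:

>     summarise(Count=n(),Proportion=Count/nrow[churnData$gender==gender])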
Error: `n()` must only be used inside dplyr verbs.
Run `rlang::last_error()` to see where the error occurred.

The proportions I need are:

0.731
0.269
0.738
0.262
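
For reference, these target values appear to be each count from step 4 divided by the total for its own gender rather than by the overall row count. A small base R check of that arithmetic, using only the counts printed above (the variable names are made up for illustration):

```r
# Counts copied from the step 4 output above.
female <- c(No = 2549, Yes = 939)
male   <- c(No = 2625, Yes = 930)

# Divide each count by its own gender's total, not by nrow(churnData).
female_share <- female / sum(female)
male_share   <- male / sum(male)

print(round(female_share, 3))
print(round(male_share, 3))
```

Rounded to three decimals, these shares agree with the four values listed above.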

Tags: r

Solution
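
One possible approach, offered as a sketch and not necessarily the answer originally posted: the `n()` error suggests the `summarise()` line was run on its own, without `churnData %>% group_by(...)` feeding it, and `nrow[...]` uses square brackets on a function instead of calling it. Rather than repairing that expression, it may be simpler to count each gender/Churn pair first and then divide by the per-gender sum. The `churn_demo` tibble below is a stand-in rebuilt from the counts in the question so the example runs without the CSV file; with the real data, `churnData` would take its place.

```r
library(dplyr)

# Stand-in for churnData (an assumption for illustration only), rebuilt
# from the gender/Churn counts printed in the question.
churn_demo <- tibble(
  gender = rep(c("Female", "Female", "Male", "Male"),
               times = c(2549, 939, 2625, 930)),
  Churn  = rep(c("No", "Yes", "No", "Yes"),
               times = c(2549, 939, 2625, 930))
)

by_gender <- churn_demo %>%
  count(gender, Churn, name = "Count") %>%     # one row per gender/Churn pair
  group_by(gender) %>%                         # denominator is now per gender
  mutate(Proportion = Count / sum(Count)) %>%  # share within each gender
  ungroup()

print(by_gender)
```

The key design choice is that `sum(Count)` is evaluated inside each `gender` group, so the proportions for each gender add up to 1 instead of being fractions of the whole table.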

