How to normalize values by group in R using dplyr?

Problem description

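I have a data frame named test and want to normalize the value column within several groups. However, my code seems to normalize the values over the whole sample: the number 1 shows up only once in the value column, but it should appear more than once because I have different groups.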

test<-structure(list(group = c("Total", "Total", "Total", "Total", 
"BMW", "BMW", "Audi", "Audi", "Skoda", "Skoda", "Skoda", "Skoda", 
"Skoda", "Skoda", "Total", "Total", "Total", "Total", "BMW", 
"BMW", "Audi", "Audi", "Skoda", "Skoda", "Skoda", "Skoda", "Skoda", 
"Skoda", "Total", "Total", "Total", "Total", "BMW", "BMW", "Audi", 
"Audi", "Skoda", "Skoda", "Skoda", "Skoda", "Skoda", "Skoda", 
"Total", "Total", "Total", "Total", "BMW", "BMW", "Audi", "Audi"
), day = c("MONDAY", "MONDAY", "MONDAY", "MONDAY", "MONDAY", 
"MONDAY", "MONDAY", "MONDAY", "MONDAY", "MONDAY", "MONDAY", "MONDAY", 
"MONDAY", "MONDAY", "TUESDAY", "TUESDAY", "TUESDAY", "TUESDAY", 
"TUESDAY", "TUESDAY", "TUESDAY", "TUESDAY", "TUESDAY", "TUESDAY", 
"TUESDAY", "TUESDAY", "TUESDAY", "TUESDAY", "WEDNESDAY", "WEDNESDAY", 
"WEDNESDAY", "WEDNESDAY", "WEDNESDAY", "WEDNESDAY", "WEDNESDAY", 
"WEDNESDAY", "WEDNESDAY", "WEDNESDAY", "WEDNESDAY", "WEDNESDAY", 
"WEDNESDAY", "WEDNESDAY", "THURSDAY", "THURSDAY", "THURSDAY", 
"THURSDAY", "THURSDAY", "THURSDAY", "THURSDAY", "THURSDAY"), 
    variable = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
    1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
    1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
    1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L), .Label = c("ALL", 
    "DE", "BE", "NL", "AUS", "ES", "IT", "FR", "PO"), class = "factor"), 
    value = c(2400000, 384000, 43826, 340174, 960000, 230400, 
    1440000, 153600, 1000, 700, 23500, 11000, 12500, 10000, 3750000, 
    185625, 85450, 100175, 2437500, 146250, 1312500, 39375, 1650, 
    1240, 35700, 29000, 6700, 6250, 3400000, 183600, 23868, 159732, 
    2210000, 88400, 1190000, 95200, 1400, 1100, 33000, 27500, 
    5500, 4500, 3100000, 186000, 18600, 167400, 1860000, 148800, 
    1240000, 37200)), row.names = c(NA, 50L), class = "data.frame")

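library(dplyr)  # provides %>%, group_by() and mutate()

# Min-max scaling: maps the values of m onto [0, 1]
normalit <- function(m) {
  (m - min(m)) / (max(m) - min(m))
}
df_scaled <- test %>% group_by(group, variable) %>% mutate(value = normalit(value))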

Tags: r

Solution


Your code already does what you describe. I get several 1s in the result. Perhaps you are not seeing all of the rows?

print.data.frame( df_scaled )
   group       day variable       value
1  Total    MONDAY      ALL 0.638205499
2  Total    MONDAY      ALL 0.097925712
3  Total    MONDAY      ALL 0.006760465
4  Total    MONDAY      ALL 0.086180522
5    BMW    MONDAY      ALL 0.371035716
6    BMW    MONDAY      ALL 0.060448682
7   Audi    MONDAY      ALL 1.000000000 <!-- here
8   Audi    MONDAY      ALL 0.082976903
9  Skoda    MONDAY      ALL 0.008571429
10 Skoda    MONDAY      ALL 0.000000000
11 Skoda    MONDAY      ALL 0.651428571
12 Skoda    MONDAY      ALL 0.294285714
13 Skoda    MONDAY      ALL 0.337142857
14 Skoda    MONDAY      ALL 0.265714286
15 Total   TUESDAY      ALL 1.000000000 <!-- here
16 Total   TUESDAY      ALL 0.044762020
17 Total   TUESDAY      ALL 0.017915528
18 Total   TUESDAY      ALL 0.021861768
19   BMW   TUESDAY      ALL 1.000000000 <!-- here
20   BMW   TUESDAY      ALL 0.024626453
21  Audi   TUESDAY      ALL 0.909110351
22  Audi   TUESDAY      ALL 0.001550470
23 Skoda   TUESDAY      ALL 0.027142857
24 Skoda   TUESDAY      ALL 0.015428571
25 Skoda   TUESDAY      ALL 1.000000000 <!-- here
26 Skoda   TUESDAY      ALL 0.808571429
27 Skoda   TUESDAY      ALL 0.171428571
28 Skoda   TUESDAY      ALL 0.158571429
29 Total WEDNESDAY      ALL 0.906201426
30 Total WEDNESDAY      ALL 0.044219328
31 Total WEDNESDAY      ALL 0.001411803
32 Total WEDNESDAY      ALL 0.037822801
33   BMW WEDNESDAY      ALL 0.903154400
34   BMW WEDNESDAY      ALL 0.000000000
35  Audi WEDNESDAY      ALL 0.821785001
36  Audi WEDNESDAY      ALL 0.041345880
37 Skoda WEDNESDAY      ALL 0.020000000
38 Skoda WEDNESDAY      ALL 0.011428571
39 Skoda WEDNESDAY      ALL 0.922857143
40 Skoda WEDNESDAY      ALL 0.765714286
41 Skoda WEDNESDAY      ALL 0.137142857
42 Skoda WEDNESDAY      ALL 0.108571429
43 Total  THURSDAY      ALL 0.825802648
44 Total  THURSDAY      ALL 0.044862518
45 Total  THURSDAY      ALL 0.000000000
46 Total  THURSDAY      ALL 0.039877794
47   BMW  THURSDAY      ALL 0.754161168
48   BMW  THURSDAY      ALL 0.025711975
49  Audi  THURSDAY      ALL 0.857428001
50  Audi  THURSDAY      ALL 0.000000000
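
One way to confirm this without scanning the output by eye is to count the rows that reach 1 within each group. The sketch below is only an illustration: it uses a small made-up data frame (toy, with hypothetical values) rather than the data from the question, and it assumes dplyr is installed. A grouped tibble usually prints only its first ten rows when it is long, so print(n = Inf) is used to show every row.

```r
library(dplyr)

# Min-max scaling, as defined in the question
normalit <- function(m) {
  (m - min(m)) / (max(m) - min(m))
}

# Hypothetical toy data, NOT the data from the question
toy <- data.frame(
  group = c("BMW", "BMW", "BMW", "Audi", "Audi", "Audi"),
  value = c(10, 20, 30, 100, 400, 250)
)

# Scale value separately within each group
toy_scaled <- toy %>%
  group_by(group) %>%
  mutate(value = normalit(value)) %>%
  ungroup()

# Show every row of the tibble instead of the truncated default
print(toy_scaled, n = Inf)

# Count how many rows reach the scaled maximum of 1 in each group
ones_per_group <- toy_scaled %>%
  group_by(group) %>%
  summarise(n_ones = sum(value == 1))

print(ones_per_group)
```

If each group reports at least one row equal to 1, the scaling was applied per group and not over the whole sample. Note that a group containing a single row, or only identical values, has max(m) - min(m) equal to 0, so normalit() returns NaN for it.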
