Strange behavior of %in%/match and logical indexing determined by numeric values

Problem description

> str(values)
'data.frame':   121 obs. of  10 variables:
 $ months  : num  0 1 2 3 4 5 6 7 8 9 ...
 $ Estimate: num  1 0.987 0.955 0.951 0.951 ...
 $ Lower   : num  1 0.972 0.929 0.923 0.923 ...
 $ Upper   : num  1 1 0.983 0.98 0.98 ...
 $ Estimate: num  1 0.982 0.954 0.936 0.917 ...
 $ Lower   : num  1 0.964 0.927 0.904 0.882 ...
 $ Upper   : num  1 1 0.982 0.969 0.955 ...
 $ Estimate: num  1 0.987 0.96 0.955 0.955 ...
 $ Lower   : num  1 0.972 0.934 0.929 0.929 ...
 $ Upper   : num  1 1 0.986 0.983 0.983 ...

> str(values$months)
 num [1:121] 0 1 2 3 4 5 6 7 8 9 ...

> values$months
  [1]   0   1   2   3   4   5   6   7   8   9  10  11  12  13  14  15  16  17
 [19]  18  19  20  21  22  23  24  25  26  27  28  29  30  31  32  33  34  35
 [37]  36  37  38  39  40  41  42  43  44  45  46  47  48  49  50  51  52  53
 [55]  54  55  56  57  58  59  60  61  62  63  64  65  66  67  68  69  70  71
 [73]  72  73  74  75  76  77  78  79  80  81  82  83  84  85  86  87  88  89
 [91]  90  91  92  93  94  95  96  97  98  99 100 101 102 103 104 105 106 107
[109] 108 109 110 111 112 113 114 115 116 117 118 119 120

Any hints as to why this happens? I have no clue. Can anyone reproduce this?

> dput(values$months)
c(0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 
18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 
34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 
50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 
66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 
82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 
98, 99, 100, 101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 
111, 112, 113, 114, 115, 116, 117, 118, 119, 120)

> seq(from = 12, to = 120, by = 12)
 [1]  12  24  36  48  60  72  84  96 108 120

> values$months %in% seq(from = 12, to = 120, by = 12)
  [1] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
 [13] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
 [25] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
 [37]  TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
 [49] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
 [61]  TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
 [73]  TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
 [85]  TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
 [97] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
[109]  TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
[121] FALSE

> which(values$months %in% seq(from = 12, to = 120, by = 12))
[1]  37  61  73  85 109

> as.integer(values$months) %in% as.integer(seq(from = 12, to = 120, by = 12))
  [1] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
 [13] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
 [25] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
 [37]  TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
 [49] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
 [61]  TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
 [73]  TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
 [85]  TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
 [97] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
[109]  TRUE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
[121] FALSE

> which(as.integer(values$months) %in% as.integer(seq(from = 12, to = 120, by = 12)))
[1]  37  61  73  85 109


Tags: r

Solution


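OK, it appears that some entries of values$months carry a tiny fractional part, so a value that prints as a whole number is actually stored slightly below it. Using as.integer is not the right fix in this case, because it truncates toward zero instead of rounding: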

> as.integer(2.99999999)
[1] 2
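
To see how this leads to the output in the question, here is a minimal sketch with made-up data (the vector `x` and the `1e-10` offset are assumptions for illustration, not the original values). Numbers that print as whole numbers can be stored slightly below them, so `%in%` misses them and `as.integer` truncates them downward.

```r
# Hypothetical data: 3 and 12 are stored slightly below the whole number.
x <- c(0, 1, 2, 3 - 1e-10, 12 - 1e-10)

print(x)               # default printing (7 significant digits) hides the error
sprintf("%.12f", x)    # printing more digits reveals the fractional part

x %in% c(3, 12)        # exact comparison: no element matches
as.integer(x)          # truncates toward zero: 3 becomes 2, 12 becomes 11
round(x)               # rounds to nearest: 3 and 12 are recovered

# Positions whose stored value is not a whole number
which(x != round(x))
```

Running the last line on the real column, as in `which(values$months != round(values$months))`, is one way to locate the affected entries.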

The correct approach is to use round:

> cbind(values$months, as.integer(values$months), round(values$months))
       [,1] [,2] [,3]
  [1,]    0    0    0
  [2,]    1    1    1
  [3,]    2    2    2
  [4,]    3    2    3
  [5,]    4    4    4
  [6,]    5    5    5
  [7,]    6    5    6
  [8,]    7    7    7
  [9,]    8    8    8
 [10,]    9    9    9
 [11,]   10   10   10
 [12,]   11   11   11
 [13,]   12   11   12       # <- difference here (rows 4 and 7 differ too)
 [14,]   13   13   13
 
....

> seq(12, 120, by = 12) %in% round(values$months)
 [1] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
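
The same idea carries over to logical indexing. The sketch below uses a small made-up data frame (`df`, its columns, and the helper `near_in` with its `tol` argument are hypothetical names, not part of the original post). Rounding is appropriate when the column is meant to hold whole numbers; when it is not, comparing within a tolerance is one possible alternative.

```r
# Hypothetical data frame; months 12 and 24 carry a tiny floating-point error.
df <- data.frame(
  months   = c(0, 6, 12 - 1e-10, 18, 24 - 1e-10),
  Estimate = c(1, 0.95, 0.91, 0.88, 0.85)
)
wanted <- seq(from = 12, to = 24, by = 12)

# Round before matching, then use the result as a logical row index.
keep <- round(df$months) %in% wanted
df[keep, ]

# Alternative: membership within a tolerance (hypothetical helper).
near_in <- function(x, table, tol = 1e-8) {
  vapply(x, function(v) any(abs(v - table) < tol), logical(1))
}
near_in(df$months, wanted)
```

Both approaches select the third and fifth rows here. The tolerance `1e-8` is an arbitrary choice and should be picked to suit the scale of the data.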

