Find the three closest rows by a feature

Problem description

library("tidyverse")
demo1 <- as.data.frame(tribble(
  ~cut,         ~freq, ~distance,
  "Fair",       1610,    4,
  "Good",       4906,    100,
  "Very Good",  12082,   45,
  "Premium",    13791,    50,
  "Ideal",      21551,  34,
  "Very good",  14938,  60
))

demo2 <- as.data.frame(tribble(
  ~cut,         ~freq, ~distance,
  "Very Good",  403,   14,
  "Fair",       3920,  25,
  "Premium",    5938,  5,
  "Good",       4593,  40,
  "Ideal",      21551, 2.4
))

For each observation (row) in demo1, I want to find the three rows in demo2 that are closest in terms of distance. The results need to be combined into a single .csv file.

The code I wrote produces an error.

Result<-data.frame()
for (k in 1:nrow(demo1)) {
  Result <- demo2 %>% 
    filter(abs(demo2$distance-demo1$distance[k])????)
  print(Result)
  }

Can anyone help me solve this? Thanks!

Tags: r, loops, apply, tidyverse, matching

Solution


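Perhaps you could try the code below, which finds, for each row of demo2, the three rows of demo1 whose distance is closest.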

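out <- bind_cols(
  demo2,
  tibble(nearest = lapply(
    split(
      # absolute difference for every (demo1, demo2) pair; demo1 varies fastest
      abs(do.call("-", expand.grid(demo1$distance, demo2$distance))),
      # one group of nrow(demo1) differences per demo2 row
      ceiling(seq(nrow(demo1) * nrow(demo2)) / nrow(demo1))
    ),
    # keep the three demo1 rows with the smallest difference
    function(x) demo1[head(order(x), 3), ]
  ))
)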

This gives

> out
# A tibble: 5 x 4
  cut        freq distance nearest
  <chr>     <dbl>    <dbl> <named list>
1 Very Good   403     14   <tibble [3 x 3]>
2 Fair       3920     25   <tibble [3 x 3]>
3 Premium    5938      5   <tibble [3 x 3]>
4 Good       4593     40   <tibble [3 x 3]>
5 Ideal     21551      2.4 <tibble [3 x 3]>

where the nearest column contains

> out$nearest
$`1`
        cut  freq distance
1      Fair  1610        4
5     Ideal 21551       34
3 Very Good 12082       45

$`2`
        cut  freq distance
5     Ideal 21551       34
3 Very Good 12082       45
1      Fair  1610        4

$`3`
        cut  freq distance
1      Fair  1610        4
5     Ideal 21551       34
3 Very Good 12082       45

$`4`
        cut  freq distance
3 Very Good 12082       45
5     Ideal 21551       34
4   Premium 13791       50

$`5`
        cut  freq distance
1      Fair  1610        4
5     Ideal 21551       34
3 Very Good 12082       45
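
Note that the output above lists, for each row of demo2, the three nearest rows of demo1. The question asks for the reverse direction (for each row of demo1, the three nearest rows of demo2) and for a single .csv file. A minimal base R sketch of that direction might look like the following. The output column names and the file name nearest_rows.csv are assumptions made for illustration, and ties in distance are broken by row order.

```r
# Same data as in the question, rebuilt with data.frame() so the block runs on its own
demo1 <- data.frame(
  cut = c("Fair", "Good", "Very Good", "Premium", "Ideal", "Very good"),
  freq = c(1610, 4906, 12082, 13791, 21551, 14938),
  distance = c(4, 100, 45, 50, 34, 60),
  stringsAsFactors = FALSE
)
demo2 <- data.frame(
  cut = c("Very Good", "Fair", "Premium", "Good", "Ideal"),
  freq = c(403, 3920, 5938, 4593, 21551),
  distance = c(14, 25, 5, 40, 2.4),
  stringsAsFactors = FALSE
)

# For each demo1 row, rank the demo2 rows by absolute difference in distance
# and keep the three smallest. Column names below are hypothetical.
nearest <- do.call(rbind, lapply(seq_len(nrow(demo1)), function(k) {
  gap <- abs(demo2$distance - demo1$distance[k])
  idx <- head(order(gap), 3)
  data.frame(
    demo1_cut = demo1$cut[k],
    demo1_distance = demo1$distance[k],
    rank = seq_along(idx),
    demo2_cut = demo2$cut[idx],
    demo2_freq = demo2$freq[idx],
    demo2_distance = demo2$distance[idx],
    gap = gap[idx],
    stringsAsFactors = FALSE
  )
}))

# One long table (three rows per demo1 row) written to a single file;
# the file name is an assumption.
write.csv(nearest, "nearest_rows.csv", row.names = FALSE)
```

The long format (one row per match) is chosen because it maps directly onto a flat .csv file, whereas the list column in out cannot be written with write.csv without first being unnested.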
