When values are tied, select a specific value based on its value in another column

Problem description

I am working in RStudio.

I have a dataset that looks like this...

Condition  TargetWord             WordProduced        WPcondition    realValue
1          Target1                  table                 P              .009
1          Target1                  word                  P              .025
1          Target1                  chair                 P              .005
1          Target1                  pole                  Q              .015
1          Target1                  skate                 Q              .023
1          Target2                  car                   Q              .014
1          Target2                  house                 P              .014
1          Target2                  shoes                 P              .019
1          Target2                  girl                  Q              .011
1          Target2                  life                  Q              .020
1          Target3                  computer              Q              .007
1          Target3                  ball                  Q              .007
1          Target3                  court                 P              .009
1          Target3                  plane                 Q              .035
1          Target3                  sky                   O              .008
2          Target4                  tree                  P              .051
2          Target4                  five                  P              .051
2          Target4                  help                  Q              .003
2          Target4                  shave                 Q              .006
2          Target4                  love                  P              .028
2          Target5                  three                 P              .056
2          Target5                  file                  Q              .056
2          Target5                  hemp                  P              .003
2          Target5                  share                 P              .006
2          Target5                  long                  Q              .028
2          Target6                  ten                   Q              .058
2          Target6                  friend                P              .051
2          Target6                  hail                  Q              .003
2          Target6                  shine                 P              .006
2          Target6                  loner                 P              .028
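
To make the example reproducible, the table above can be read into a data frame along these lines. This is only a sketch: the name `df` is an assumption (chosen to match the name used in the solution further down), and the column types are whatever `read.table` infers from the text.

```r
# Sketch: load the sample table into a data frame named `df` (assumed name).
# The rows are copied from the table above, including the single "O" entry.
df <- read.table(header = TRUE, stringsAsFactors = FALSE, text = "Condition TargetWord WordProduced WPcondition realValue
1 Target1 table    P .009
1 Target1 word     P .025
1 Target1 chair    P .005
1 Target1 pole     Q .015
1 Target1 skate    Q .023
1 Target2 car      Q .014
1 Target2 house    P .014
1 Target2 shoes    P .019
1 Target2 girl     Q .011
1 Target2 life     Q .020
1 Target3 computer Q .007
1 Target3 ball     Q .007
1 Target3 court    P .009
1 Target3 plane    Q .035
1 Target3 sky      O .008
2 Target4 tree     P .051
2 Target4 five     P .051
2 Target4 help     Q .003
2 Target4 shave    Q .006
2 Target4 love     P .028
2 Target5 three    P .056
2 Target5 file     Q .056
2 Target5 hemp     P .003
2 Target5 share    P .006
2 Target5 long     Q .028
2 Target6 ten      Q .058
2 Target6 friend   P .051
2 Target6 hail     Q .003
2 Target6 shine    P .006
2 Target6 loner    P .028
")

# Inspect the inferred column types
str(df)
```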

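So each target is repeated five times, and I need to keep only the first row of each. The problem is that if the first two positions have the same realValue (.014 and .014), I need the one that has P under WPcondition.

That is, before filtering on the first position, if the first two positions have the same realValue, I need to look at the column to its left (WPcondition) to see whether one of them is P. If one of them is P, I need to put that one in the first position.

For example...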

1position  P   .05
2position  P   .05
(stay with the one in the first position)

1position  Q   .05
2position  P   .05
(use the one in the second position because it has a P)

1position  Q   .05
2position  Q   .05
(stay with the one in the first position)

1position  P   .05
2position  Q   .05
(stay with the one in the first position)

1position  P   .06
2position  Q   .05
(stay with the one in the first position because its realValue is higher)

1position  Q   .06
2position  P   .05
(stay with the one in the first position because its realValue is higher)

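So I need to keep the one with the higher value, but if the values are the same, the P/Q value has to be taken into account: if one of them is P, pick that one.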
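
The rule described by the examples above can be written down as a small function. This is a minimal sketch, not part of the original question: `pick_position` is a hypothetical helper name, and it assumes exactly two positions are compared and that a tie means exactly equal values.

```r
# Hypothetical helper: given the realValue and WPcondition of the first two
# positions, return the position (1 or 2) that should be kept.
pick_position <- function(value, cond) {
  stopifnot(length(value) == 2, length(cond) == 2)
  if (value[1] != value[2]) {
    # Values differ: keep the position with the higher realValue
    return(which.max(value))
  }
  # Values are tied: move to the second position only if it has a P
  # and the first position does not; otherwise stay with the first
  if (cond[1] != "P" && cond[2] == "P") 2L else 1L
}

pick_position(c(.05, .05), c("Q", "P"))  # returns 2
pick_position(c(.06, .05), c("Q", "P"))  # returns 1
```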

Given the data above, I would expect this...

Condition   TargetWord     WordProduced   WPcondition    realValue
  1          Target1           word            P            .025
  1          Target2           house           P            .014
  1          Target3           computer        Q            .007
  2          Target4           tree            P            .051
  2          Target5           three           P            .056
  2          Target6           ten             Q            .058

Any help would be great.

Thanks.

Tags: r

Solution


If I have understood you correctly, you want to select the row with the highest realValue for each TargetWord and, when realValue is tied, prefer P over Q.

Because "P" < "Q" when strings are compared, sorting WPcondition in ascending order puts the P rows first, so we can do:

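library(dplyr)

df %>%
  # highest realValue first within each target; on ties "P" sorts before "Q"
  arrange(Condition, TargetWord, desc(realValue), WPcondition) %>%
  group_by(Condition, TargetWord) %>%
  # keep the top row of each group
  slice(1L) %>%
  ungroup()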

#  Condition TargetWord WordProduced WPcondition realValue
#      <int> <chr>      <chr>        <chr>           <dbl>
#1         1 Target1    word         P               0.025
#2         1 Target2    life         Q               0.02 
#3         1 Target3    plane        Q               0.035
#4         2 Target4    tree         P               0.051
#5         2 Target5    three        P               0.056
#6         2 Target6    ten          Q               0.058
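
Note that the output above takes the maximum over all five rows of each target, which is why Target2 and Target3 show life and plane rather than the house and computer listed in the question's expected result. If the comparison is meant to cover only the first two positions, as the question's examples suggest, one possible variant is sketched below. It is an assumption about the intent, not the original answer. The helper column `pos` is a hypothetical name, and only the Target2 and Target3 rows are used to keep the example short. It tests `WPcondition != "P"` explicitly rather than relying on alphabetical order, because the sample data also contains an "O", which sorts before "P".

```r
library(dplyr)

# Target2 and Target3 rows copied from the question's data
df <- read.table(header = TRUE, stringsAsFactors = FALSE, text = "Condition TargetWord WordProduced WPcondition realValue
1 Target2 car      Q .014
1 Target2 house    P .014
1 Target2 shoes    P .019
1 Target2 girl     Q .011
1 Target2 life     Q .020
1 Target3 computer Q .007
1 Target3 ball     Q .007
1 Target3 court    P .009
1 Target3 plane    Q .035
1 Target3 sky      O .008
")

first_two <- df %>%
  group_by(Condition, TargetWord) %>%
  mutate(pos = row_number()) %>%   # position of each row within its target
  filter(pos <= 2) %>%             # compare only the first two positions
  # higher realValue first; on ties P rows first (FALSE sorts before TRUE);
  # if still tied, the original position decides
  arrange(desc(realValue), WPcondition != "P", pos, .by_group = TRUE) %>%
  slice(1L) %>%
  ungroup() %>%
  select(-pos)

first_two
```

With these rows, the sketch keeps house for Target2 (tie broken by P) and computer for Target3 (tie with no P, so the first position stays).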
