Unary operator error when excluding multiple columns by name with pivot_longer

Problem description

I have a data frame of calculated means and their respective standard errors:

    Experiment Stim      Status Treatment  Time Count CD1.CD2.Freq.Mean CD1.CD2.Freq.se
31        125   NS     Control      None  1hr     8           0.244375     0.0385273268
32        125   NS     Control      None  2hr     8           0.303000     0.0296515478
33        125   NS     Control   1,25-VD  1hr     8           0.257625     0.0344901319
34        125   NS     Control   1,25-VD  2hr     8           0.280750     0.0337827883
35        125   TT     Control      None  1hr     8           0.944375     0.0985273268
36        125   TT     Control      None  2hr     8           0.933000     0.0696515478
37        125   TT     Control   1,25-VD  1hr     8           0.127625     0.0444901319
38        125   TT     Control   1,25-VD  2hr     8           0.100750     0.0137827883

I am trying to use pivot_longer so that I can plot several Freq.Mean columns on a single ggplot. I first dropped the columns I am not interested in (Experiment and Count).

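I now need to keep the columns "Stim", "Status", "Treatment" and "Time", while creating one column for the marker of interest (e.g. "CD1.CD2" = "Marker"), one for the observed value ("Freq.Mean" = "Frequency"), and one for the corresponding standard error I calculated ("Freq.se" = "se"). I have more markers of interest (e.g. "CD1.CD3"), each followed by the same ".Freq.Mean" or ".Freq.se" suffix.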

My attempt:

CD1.Freq.Means.125Long <- CD1.Freq.Means.125 %>%
  select(-c("Experiment", "Count"))
  pivot_longer(
    cols = -c("Stim", "Status", "Treatment", "Time"),
    names_to = c("Marker"),
    names_pattern = c(".Freq.Mean", ".Freq.se"),
    values_to = c("Frequency", "se")
  )

I get this error:

Error in -c("Stim", "Status", "Treatment", "Time") : 
  invalid argument to unary operator

I am fairly new to R and StackOverflow, so I apologize if I have not provided the reproducible example quite correctly.

Tags: r, tidyverse

Solution


Maybe try this approach:

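library(tidyverse)
# Code
newdf <- df %>%
  # Drop the columns that are not needed
  select(-c(Experiment, Count)) %>%
  # Stack every measurement column into name/value pairs
  pivot_longer(-c(Stim, Status, Treatment, Time)) %>%
  # Mark the marker/statistic boundary: "CD1.CD2.Freq.Mean" -> "CD1.CD2_Freq.Mean"
  mutate(name = gsub('.Freq', '_Freq', name, fixed = TRUE)) %>%
  separate(name, c('Marker', 'Var'), sep = '_') %>%
  # Spread Freq.Mean and Freq.se back into their own columns
  pivot_wider(names_from = Var, values_from = value) %>%
  rename(Frequency = Freq.Mean)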

Output:

# A tibble: 8 x 7
  Stim  Status  Treatment Time  Marker  Frequency Freq.se
  <chr> <chr>   <chr>     <chr> <chr>       <dbl>   <dbl>
1 NS    Control None      1hr   CD1.CD2     0.244  0.0385
2 NS    Control None      2hr   CD1.CD2     0.303  0.0297
3 NS    Control 1,25-VD   1hr   CD1.CD2     0.258  0.0345
4 NS    Control 1,25-VD   2hr   CD1.CD2     0.281  0.0338
5 TT    Control None      1hr   CD1.CD2     0.944  0.0985
6 TT    Control None      2hr   CD1.CD2     0.933  0.0697
7 TT    Control 1,25-VD   1hr   CD1.CD2     0.128  0.0445
8 TT    Control 1,25-VD   2hr   CD1.CD2     0.101  0.0138
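
As a side note, the error in the question most likely comes from the missing %>% after select(): without it, pivot_longer() runs as a separate call with no data frame, so -c("Stim", ...) is evaluated as ordinary R code instead of as a column selection. The sketch below restores the pipe and reshapes in a single pivot_longer() call by using the special ".value" entry in names_to. The name long_df, the two-row sample (copied from the question's table), and the regular expression are assumptions made for illustration; the pattern assumes every measurement column is named <marker>.Freq.<statistic>.

```r
library(dplyr)
library(tidyr)

# Two rows copied from the question's table, to keep the sketch short
df <- data.frame(
  Experiment = c(125L, 125L),
  Stim = c("NS", "NS"),
  Status = c("Control", "Control"),
  Treatment = c("None", "None"),
  Time = c("1hr", "2hr"),
  Count = c(8L, 8L),
  CD1.CD2.Freq.Mean = c(0.244375, 0.303),
  CD1.CD2.Freq.se = c(0.0385273268, 0.0296515478)
)

long_df <- df %>%
  select(-c(Experiment, Count)) %>%   # the pipe here is what the attempt was missing
  pivot_longer(
    cols = -c(Stim, Status, Treatment, Time),
    # First capture group becomes the Marker column; the second group
    # (".value") supplies the names of the output value columns: Mean, se
    names_to = c("Marker", ".value"),
    names_pattern = "(.*)\\.Freq\\.(.*)"
  ) %>%
  rename(Frequency = Mean)

print(long_df)
```

Because the marker name is captured by the pattern rather than hard-coded, additional columns such as CD1.CD3.Freq.Mean and CD1.CD3.Freq.se should be picked up as extra rows with Marker set to "CD1.CD3", provided they follow the same naming scheme.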

Some data used:

#Data
df <- structure(list(Experiment = c(125L, 125L, 125L, 125L, 125L, 125L, 
125L, 125L), Stim = c("NS", "NS", "NS", "NS", "TT", "TT", "TT", 
"TT"), Status = c("Control", "Control", "Control", "Control", 
"Control", "Control", "Control", "Control"), Treatment = c("None", 
"None", "1,25-VD", "1,25-VD", "None", "None", "1,25-VD", "1,25-VD"
), Time = c("1hr", "2hr", "1hr", "2hr", "1hr", "2hr", "1hr", 
"2hr"), Count = c(8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L), CD1.CD2.Freq.Mean = c(0.244375, 
0.303, 0.257625, 0.28075, 0.944375, 0.933, 0.127625, 0.10075), 
    CD1.CD2.Freq.se = c(0.0385273268, 0.0296515478, 0.0344901319, 
    0.0337827883, 0.0985273268, 0.0696515478, 0.0444901319, 0.0137827883
    )), class = "data.frame", row.names = c("31", "32", "33", 
"34", "35", "36", "37", "38"))

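Since the goal stated in the question is to plot the means with their standard errors on one ggplot, here is a rough sketch of how the long table could be plotted. It is only one possible layout, not something taken from the original answer: the bar chart, the dodge width, and the faceting by Marker are assumptions, and the small long_df below is rebuilt by hand from four rows of the printed output (Treatment "None" only) so that the block runs on its own.

```r
library(ggplot2)

# Four rows taken from the printed output above (Treatment == "None")
long_df <- data.frame(
  Stim = c("NS", "NS", "TT", "TT"),
  Treatment = "None",
  Time = c("1hr", "2hr", "1hr", "2hr"),
  Marker = "CD1.CD2",
  Frequency = c(0.244, 0.303, 0.944, 0.933),
  Freq.se = c(0.0385, 0.0297, 0.0985, 0.0697)
)

p <- ggplot(long_df, aes(x = Time, y = Frequency, fill = Stim)) +
  # One bar per Stim level, placed side by side within each time point
  geom_col(position = position_dodge(width = 0.9)) +
  # Error bars spanning mean +/- one standard error, dodged like the bars
  geom_errorbar(
    aes(ymin = Frequency - Freq.se, ymax = Frequency + Freq.se),
    position = position_dodge(width = 0.9),
    width = 0.2
  ) +
  # One panel per marker, so further markers get their own panels
  facet_wrap(~ Marker)

print(p)
```

With the full result from the answer, Treatment could be mapped to another aesthetic or added to the facets, depending on which comparison matters most.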