Parsing specific elements in R

Problem description

V21.X1  V21.X2
A       02:01:03
A       02:01:04
A       03:01:05
A       03:01:04

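Using which, I took a vector (a column) from a data frame and split it into two new columns (shown above). Now I want to keep only the rows that contain 02:01. I tried to split off V21.X1 on its own using which again, but R does not treat V21.X2 as a separate column; it treats it as part of the combined X1/X2 column, V21.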

I want to store this output in another variable:

V21.X1  V21.X2   
A       02:01:03
A       02:01:04

Tags: r, parsing

Solution


We can filter the rows with a regex that matches the pattern at the start of the string:

subset(df1, grepl("^02:01", V21.X2))
#   V21.X1   V21.X2
#1      A 02:01:03
#2      A 02:01:04

Or extract the first five characters with substr and compare them with ==:

subset(df1, substr(V21.X2, 1, 5)=='02:01')
#  V21.X1   V21.X2
#1      A 02:01:03
#2      A 02:01:04
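
Because the pattern here is a fixed prefix, base R's startsWith (available since R 3.3.0) is another option that avoids regex syntax altogether. The following is a minimal sketch, assuming V21.X2 is a character column as in the data at the end of this answer; the name res is only an illustrative choice for the variable that stores the filtered rows.

```r
# Example data, equivalent to the dput output at the end of this answer
df1 <- data.frame(V21.X1 = c("A", "A", "A", "A"),
                  V21.X2 = c("02:01:03", "02:01:04", "03:01:05", "03:01:04"),
                  stringsAsFactors = FALSE)

# Keep the rows whose V21.X2 begins with the literal prefix "02:01"
# and store them in a new variable (res is a hypothetical name)
res <- subset(df1, startsWith(V21.X2, "02:01"))
res
```

Note that startsWith requires a character vector, so a factor column would need as.character() first.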

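If the data frame holds a matrix column, there is only a single column, 'V21', which stores a matrix with two columns, 'X1' and 'X2'. In that case 'V21.X2' is only a label used when printing, not a real column name, so subset cannot find it: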

m1 <- cbind(X1 = "A", X2 = c("02:01:03", "02:01:04", "03:01:05", "03:01:04"))
df1 <- data.frame(V21 = rep(NA, 4))
df1$V21 <- m1
subset(df1,  grepl("^02:01", V21.X2))

Error in grepl("^02:01", V21.X2) : object 'V21.X2' not found
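
To check whether a data frame is in this situation, it may help to inspect its real structure rather than its printed form. This is a small diagnostic sketch that rebuilds the matrix-column example from above:

```r
# Rebuild the matrix-column example
m1 <- cbind(X1 = "A", X2 = c("02:01:03", "02:01:04", "03:01:05", "03:01:04"))
df1 <- data.frame(V21 = rep(NA, 4))
df1$V21 <- m1

names(df1)          # the actual column names of the data.frame
is.matrix(df1$V21)  # TRUE when the column itself holds a matrix
colnames(df1$V21)   # the names of the matrix's own columns
str(df1)            # compact overview of the nested structure
```

If names(df1) shows only 'V21' while printing df1 shows 'V21.X1' and 'V21.X2', the column is a matrix column.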

One solution is to expand the matrix into ordinary data.frame columns and then do the subset:

df2 <- do.call(data.frame, df1)
subset(df2, grepl("^02:01", V21.X2))
#   V21.X1   V21.X2
#1      A 02:01:03
#2      A 02:01:04
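
An alternative that keeps the matrix column intact is to index into the matrix directly and use the resulting logical vector to select rows. This is a sketch under the same assumptions as the matrix-column example above; keep and res are hypothetical names.

```r
# Rebuild the matrix-column example
m1 <- cbind(X1 = "A", X2 = c("02:01:03", "02:01:04", "03:01:05", "03:01:04"))
df1 <- data.frame(V21 = rep(NA, 4))
df1$V21 <- m1

# Pull the X2 column out of the matrix stored in V21 and test the prefix
keep <- grepl("^02:01", df1$V21[, "X2"])

# drop = FALSE keeps the result a data.frame; without it, a
# one-column data.frame would be simplified to its single column
res <- df1[keep, , drop = FALSE]
res
```

Here res still holds 'V21' as a matrix column, so do.call(data.frame, res) would be needed if ordinary columns are wanted afterwards.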

data

df1 <- structure(list(V21.X1 = c("A", "A", "A", "A"), V21.X2 = c("02:01:03", 
 "02:01:04", "03:01:05", "03:01:04")), .Names = c("V21.X1", "V21.X2"
 ), class = "data.frame", row.names = c(NA, -4L))
