R: (SQL-style) %LIKE% statements

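Problem description

I am trying to figure out how to use something similar to a %LIKE% statement in the R programming language. Using the Stack Overflow post "How to join (merge) data frames (inner, outer, left, right)", I was able to figure out how to run basic SQL-style merges in R. But for some reason, this does not work with a %LIKE% condition.

For example, suppose I create the following data: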

table_a <- data.frame(
  "name" = c("John", "ALex", "ToM", "Kev", "Peter"),
  "color" = c("red", "blue", "green", "yellow", "pink")
)

table_a$name = as.factor(table_a$name)

table_b <- data.frame(
  "name" = c("Johnathan", "Alexander", "Tomas", "Kevin", "Luke", "Ryan"),
  "food" = c("pizza", "tacos", "sushi", "cake", "brownies", "burgers")
)

table_b$name = as.factor(table_b$name)

table_c <- data.frame(
  "name" = c("Johnatha", "Alexande1", "Toma1", "Kevi1", "Luk1"),
  "food" = c("pizza", "tacos", "sushi", "cake", "brownies")
)

table_c$name = as.factor(table_c$name)

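What I want to do now is run a "left join" whenever a name in table_a is contained somewhere within a name in table_b. (By the same logic, should a one-sided %LIKE also be possible?)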

#Left joins

join_1 =  merge(x = table_a, y = table_b, by = "%name%", all.x = TRUE)

join_2 =  merge(x = table_b, y = table_c, by = "%name", all.x = TRUE)

In a regular SQL statement, you can usually select rows directly if they satisfy a condition specified with %LIKE%. Is the same thing possible in R?

# select using %LIKE% (is there a way to override "case sensitivity" ? e.g. %like% "jOn"?)

selected_1 = table_a[name %like% "Jon"|| "Ale" || "Pet"]
selected_2 = table_a[name %like% "Jon"|| "Ale" || "Pet" || color %like% "ye"]

Thanks

Tags: sql, r, select, merge, data-manipulation

Solution


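I think you will need other R functions to achieve what merge cannot do natively. grepl is the main function for checking whether one string is found within another. If you want other patterns, you can use startsWith (for LIKE%) or endsWith (for %LIKE) instead of grepl.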
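As a rough illustration of how these three functions line up with SQL's LIKE patterns, here is a minimal sketch that also covers the row-selection part of the question. It rebuilds table_a with character (not factor) columns, because startsWith and endsWith expect character vectors; the variable names hit_name, hit_color, prefix_jo and suffix_er are only illustrative.

```r
# Rebuild table_a with plain character columns (assumption: no factors here)
table_a <- data.frame(
  name  = c("John", "ALex", "ToM", "Kev", "Peter"),
  color = c("red", "blue", "green", "yellow", "pink"),
  stringsAsFactors = FALSE
)

# name LIKE '%Jon%' OR '%Ale%' OR '%Pet%', ignoring case:
# "|" is regex alternation, ignore.case = TRUE removes case sensitivity
hit_name <- grepl("Jon|Ale|Pet", table_a$name, ignore.case = TRUE)
selected_1 <- table_a[hit_name, ]

# ... OR color LIKE '%ye%': fixed = TRUE treats the pattern as a literal string
hit_color <- grepl("ye", table_a$color, fixed = TRUE)
selected_2 <- table_a[hit_name | hit_color, ]

# One-sided patterns (both are case-sensitive)
prefix_jo <- startsWith(table_a$name, "Jo")  # name LIKE 'Jo%'
suffix_er <- endsWith(table_a$name, "er")    # name LIKE '%er'

selected_1$name            # "ALex" "Peter"  ("John" does not contain "Jon")
selected_2$name            # "ALex" "Kev" "Peter"
table_a$name[prefix_jo]    # "John"
table_a$name[suffix_er]    # "Peter"
```

Note that the vectorised `|` is used to combine the logical vectors; `||` only works on single values and is not suitable for filtering rows.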

#sapply iterates over the names of table_a
table_a$name_b <- sapply(table_a$name, function(x)  {
                          #check to see if the names of table_a 
                          #are included within table_b
                          #which(...)[1] selects the first instance
                          check <- which(grepl(x, table_b$name, ignore.case = TRUE))[1]
                          #filter table_b and return what matched.
                          table_b$name[check]
                        })

Output:

# table_a
#   name  color    name_b
#1  John    red Johnathan
#2  ALex   blue Alexander
#3   ToM  green     Tomas
#4   Kev yellow     Kevin
#5 Peter   pink      <NA>
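Once the matching name is available as a helper column, an ordinary merge can finish the left join. The following is a minimal sketch under a few assumptions: the tables use character columns, only the first matching table_b name is kept for each table_a row, and first_match is a hypothetical helper rather than a base R function.

```r
table_a <- data.frame(
  name  = c("John", "ALex", "ToM", "Kev", "Peter"),
  color = c("red", "blue", "green", "yellow", "pink"),
  stringsAsFactors = FALSE
)
table_b <- data.frame(
  name = c("Johnathan", "Alexander", "Tomas", "Kevin", "Luke", "Ryan"),
  food = c("pizza", "tacos", "sushi", "cake", "brownies", "burgers"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: return the first candidate that contains x
# (case-insensitive, literal match), or NA when nothing matches
first_match <- function(x, candidates) {
  hit <- which(grepl(tolower(x), tolower(candidates), fixed = TRUE))[1]
  candidates[hit]
}

# Helper key: the table_b name matched by each table_a name
table_a$name_b <- vapply(table_a$name, first_match, character(1),
                         candidates = table_b$name, USE.NAMES = FALSE)

# Regular left join on the helper key; unmatched rows keep NA in food
join_1 <- merge(x = table_a, y = table_b,
                by.x = "name_b", by.y = "name", all.x = TRUE)
```

Here "Peter" has no counterpart in table_b, so its row is kept with NA in the food column, which is the behaviour expected from a left join. If one table_a name could match several table_b names and all matches are needed, this first-match approach would have to be replaced by something that expands the rows.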
                  
