Regex: how to match the word apple but not pine_apple in BigQuery

Problem description

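There are two baskets. Basket_1 contains an apple, a mango, and an orange. Basket_2 contains 2 apples and 2 pine_apples. The regex pattern "apple" matches the word apple as well as the apple inside pine_apple. Please advise on how to match only the standalone word apple.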

#standardSQL
with table1 as(
SELECT "basket_1" as basket,"apple" as fruit UNION ALL
SELECT "basket_1","mango" as fruit UNION ALL
SELECT "basket_2","apple" as fruit UNION ALL
SELECT "basket_2","apple" as fruit UNION ALL
SELECT "basket_2","pine_apple" as fruit UNION ALL
SELECT "basket_2","pine_apple" as fruit UNION ALL
SELECT "basket_1","orange" as fruit 
)
SELECT
  basket,
  string_agg(fruit) AS fruits_in_each_basket,
  regexp_extract_all(string_agg(fruit), r'(?i)apple') AS apple
FROM table1
GROUP BY basket

Tags: google-bigquery

Solution


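Here is an alternative version that does not use a regex. It relies on ARRAY_AGG with a condition that evaluates to NULL when the fruit is not apple (compared case-insensitively via LOWER), and IGNORE NULLS then skips adding those entries to the array: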

#standardSQL
with table1 as(
SELECT "basket_1" as basket,"apple" as fruit UNION ALL
SELECT "basket_1","mango" as fruit UNION ALL
SELECT "basket_2","apple" as fruit UNION ALL
SELECT "basket_2","Apple" as fruit UNION ALL
SELECT "basket_2","pine_apple" as fruit UNION ALL
SELECT "basket_2","pine_apple" as fruit UNION ALL
SELECT "basket_1","orange" as fruit 
)
SELECT
  basket,
  STRING_AGG(fruit) AS fruits_in_each_basket,
  ARRAY_AGG(IF(LOWER(fruit) = 'apple', fruit, NULL) IGNORE NULLS) AS apple 
FROM table1
GROUP BY basket
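
If a regex is still preferred, the following is a minimal sketch rather than a definitive answer. It assumes that the \b word-boundary assertion of RE2 (the regex library BigQuery uses) treats the underscore as a word character, so the "apple" inside "pine_apple" is not preceded by a boundary, while an "apple" that follows the comma separator produced by STRING_AGG is. The sample rows are the same illustrative data as above.

#standardSQL
WITH table1 AS (
  SELECT "basket_1" AS basket, "apple" AS fruit UNION ALL
  SELECT "basket_1", "mango" UNION ALL
  SELECT "basket_2", "apple" UNION ALL
  SELECT "basket_2", "Apple" UNION ALL
  SELECT "basket_2", "pine_apple" UNION ALL
  SELECT "basket_2", "pine_apple" UNION ALL
  SELECT "basket_1", "orange"
)
SELECT
  basket,
  STRING_AGG(fruit) AS fruits_in_each_basket,
  -- (?i) makes the match case-insensitive.
  -- \b requires a word boundary on both sides of "apple";
  -- "_" is a word character, so "pine_apple" does not match.
  REGEXP_EXTRACT_ALL(STRING_AGG(fruit), r'(?i)\bapple\b') AS apple
FROM table1
GROUP BY basket

Note that \b only guards against letters, digits, and underscores next to the word. A value such as "pine-apple" or "pine apple" would still match, so the ARRAY_AGG comparison above is the stricter choice when each row holds exactly one fruit name.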
