Remove irrelevant data from a results table

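Problem description

I queried a database of articles to find the ten people who are mentioned most often, but some of the results are irrelevant. I used the following code to exclude one of them, and it worked: the irrelevant result was removed.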

SELECT
  person,
  COUNT(1) AS count_mentions,
  COUNT(DISTINCT url) AS count_distinct_urls
FROM
  `myproject.mytable.schema`
WHERE
  LOWER(person) NOT LIKE '%irrelevant_results1%'
GROUP BY
  person
ORDER BY
  count_mentions DESC
LIMIT
  10;

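But when I tried the same approach to remove all the other irrelevant results, it did not work: it removed only the first two irrelevant results, not the third.

Can you help me find the problem?

Thank you!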

SELECT
  person,
  COUNT(1) AS count_mentions,
  COUNT(DISTINCT url) AS count_distinct_urls
FROM
  `myproject.mytable.schema`
WHERE
  LOWER(person) NOT LIKE '%irrelevant_results1%'
  AND LOWER(person) NOT LIKE '%irrelevant_results2%'
  AND LOWER(person) NOT LIKE '%irrelevant_results3%'
GROUP BY
  person
ORDER BY
  count_mentions DESC
LIMIT
  10;

Tags: google-bigquery

Solution
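The three NOT LIKE conditions joined with AND are logically sound, so the third filter most likely fails because its pattern does not match the stored text exactly. This is a possible explanation, not a confirmed diagnosis. LIKE is case-sensitive in BigQuery, so because the column is wrapped in LOWER(), the pattern literal must be entirely lowercase. Extra or unusual whitespace and look-alike Unicode characters in the stored name would also keep the pattern from matching. The query below is a minimal diagnostic sketch: it lists the raw values that still resemble the third unwanted name together with their code points. The fragment '%results3%' is a hypothetical, deliberately looser piece of the placeholder used in the question; substitute a short, unambiguous part of the real name.

```sql
-- Diagnostic sketch: inspect the raw values that the third filter fails to remove.
-- Assumption: '%results3%' is a hypothetical short fragment of the unwanted name.
WITH candidates AS (
  SELECT
    person,
    COUNT(1) AS count_mentions
  FROM
    `myproject.mytable.schema`
  WHERE
    LOWER(person) LIKE '%results3%'
  GROUP BY
    person
)
SELECT
  person,
  LENGTH(person) AS char_length,           -- an unexpected length hints at hidden characters
  TO_CODE_POINTS(person) AS code_points,   -- exposes non-standard spaces or look-alike letters
  count_mentions
FROM
  candidates
ORDER BY
  count_mentions DESC;
```

If the output shows a value that differs from the pattern in spacing, punctuation, or characters, adjusting the third NOT LIKE pattern to match what is actually stored (in lowercase) should remove it from the top ten.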

