PostgreSQL full-text search cannot find "andy"

Problem description

I have this PostgreSQL query:

SELECT d.user_id, display_name, avatar_url
FROM user_directory_search AS d
WHERE
d.user_id LIKE '@and%';

I get these results:

                    user_id             | display_name | avatar_url
----------------------------------------+--------------+------------
 @andy.huang:synapse.siliconmotion.com  |              |
 @andy.zhao:synapse.siliconmotion.com   | Andy.zhao    |
 @andy.yao:synapse.siliconmotion.com    |              |
 @andy.zou:synapse.siliconmotion.com    |              |
 @andy.xie:synapse.siliconmotion.com    |              |
 @andy.chang:synapse.siliconmotion.com  | andy.chang   |
 @andy.chuang:synapse.siliconmotion.com | andy.chuang  |
 @andy.hsiao:synapse.siliconmotion.com  |              |
(8 rows)

But when I run this query instead:

SELECT d.user_id, display_name, avatar_url
FROM user_directory_search AS d
WHERE
vector @@ to_tsquery('english', '(andy:* | andy)');

I get nothing:

 user_id | display_name | avatar_url
---------+--------------+------------
(0 rows)

Does anyone know why?

Tags: postgresql, search, full-text-search

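Solution

The problem is that the full-text parser treats these strings as host names, so "andy" is never indexed as a word of its own: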

SELECT alias, description, token, lexemes
FROM ts_debug('english', '@andy.huang:synapse.siliconmotion.com')
WHERE alias <> 'blank';

 alias | description |           token           |           lexemes           
-------+-------------+---------------------------+-----------------------------
 host  | Host        | andy.huang                | {andy.huang}
 host  | Host        | synapse.siliconmotion.com | {synapse.siliconmotion.com}
(2 rows)
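
To see the mismatch directly, it can help to print both sides of the @@ operator for one of the affected values. This is only a minimal sketch built from the literal in the example and built-in functions. It assumes the stored vector column was produced with the english configuration, which the question does not confirm.

```sql
-- Compare what gets indexed with what is being searched for.
-- Assumption: the stored "vector" column was built as to_tsvector('english', user_id).
SELECT to_tsvector('english', '@andy.huang:synapse.siliconmotion.com') AS indexed_lexemes,
       to_tsquery('english', '(andy:* | andy)')                        AS query_lexemes,
       to_tsvector('english', '@andy.huang:synapse.siliconmotion.com')
           @@ to_tsquery('english', '(andy:* | andy)')                 AS matches;
```

The english configuration stems the query word andy to andi, while the host token andy.huang is stored unstemmed. The query lexeme is therefore neither equal to nor a prefix of any indexed lexeme, and matches should come back false.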

You could replace the offending periods with spaces at indexing time:

SELECT alias, description, token, lexemes
FROM ts_debug('english',
              translate('@andy.huang:synapse.siliconmotion.com', '.', ' '))
WHERE alias <> 'blank';

   alias   |   description   |     token     |   lexemes    
-----------+-----------------+---------------+--------------
 asciiword | Word, all ASCII | andy          | {andi}
 asciiword | Word, all ASCII | huang         | {huang}
 asciiword | Word, all ASCII | synapse       | {synaps}
 asciiword | Word, all ASCII | siliconmotion | {siliconmot}
 asciiword | Word, all ASCII | com           | {com}
(5 rows)
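
Applied to the table from the question, that idea might look roughly like the sketch below. It assumes that vector is a plain tsvector column derived only from user_id and that you are free to rebuild it. If an application maintains this table, it would be safer to try this on a copy first.

```sql
-- Assumption: "vector" is a tsvector column derived only from user_id and may be rebuilt.
-- Rebuild it with the periods turned into spaces so each name part becomes its own word.
UPDATE user_directory_search
SET vector = to_tsvector('english', translate(user_id, '.', ' '));

-- The prefix search should now find the rows, because "andy" is indexed as a separate word.
SELECT user_id
FROM user_directory_search
WHERE vector @@ to_tsquery('english', 'andy:*');
```

Whatever code inserts new rows would need to apply the same translate() expression, otherwise new entries would again be indexed as host names.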

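But if I were you, I would use the simple full-text search configuration. Or do you actually want stemming (compare the "token" and "lexemes" columns above)?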
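
As a rough illustration of that suggestion, the following sketch runs the same literal through the simple configuration on both sides. Only built-in functions are used; the column aliases are made up for readability.

```sql
-- 'simple' lowercases tokens but does not stem them, so "andy" stays "andy".
SELECT to_tsvector('simple',
                   translate('@andy.huang:synapse.siliconmotion.com', '.', ' ')) AS indexed_lexemes,
       to_tsquery('simple', 'andy:*')                                            AS query_lexemes,
       to_tsvector('simple',
                   translate('@andy.huang:synapse.siliconmotion.com', '.', ' '))
           @@ to_tsquery('simple', 'andy:*')                                     AS matches;
```

For user names, which are not English prose, skipping the stemmer avoids surprises such as andy becoming andi. The same configuration has to be used in both to_tsvector and to_tsquery, or the two sides may again disagree.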

