首页 > 解决方案 > 使用 Postgresql 进行多语言全文搜索

问题描述

用户输入可以是英语或意大利语。这些数据都是英文和(大部分)意大利文的。以下是我的查询(似乎有效),我的问题是这是否是处理未知语言输入的正确方法。(在示例中,用户输入单词“wine”):

    SELECT id, name
    FROM (
        SELECT p.id, p.name,
                to_tsvector('italian', p.name) || --some data are only in italian
                to_tsvector('italian', cat.category) || 
                to_tsvector((CASE WHEN de.language = 'ITA' THEN 'italian' ELSE 'english' END)::regconfig, coalesce(string_agg(de.descr, ' '))) as document 

        FROM myschema.product p
        INNER JOIN myschema.disc d ON d.id_disc = p.id_disc
        INNER JOIN myschema.disc_city dc ON dc.id_disc = d.id_disc
        INNER JOIN myschema.city c ON c.id_city = dc.id_city 
        INNER JOIN myschema.category cat ON cat.id_category = d.id_category
        INNER JOIN myschema.product_desc pd ON pd.id = p.id --One p.id to Many pd.id, a product can have multiple descriptions
        INNER JOIN myschema.descr de ON de.id_descr = pd.id_descr
        GROUP BY p.id, p.name, cat.category, de.language    
    ) p_search
--handling input 'wine' of unknown language (could be too the italian 'vino')
    WHERE p_search.document @@ to_tsquery('italian', 'wine') OR
        p_search.document @@ to_tsquery('english', 'wine');
    GROUP BY id, name

标签: postgresqlsearchfull-text-searchmultilingual

解决方案


您可以使用“简单”字典,对此进行测试:

SELECT to_tsvector('english', 'The wine is good');
SELECT to_tsvector('italian', 'The wine is good');
SELECT to_tsvector('simple', 'The wine is good');

SELECT to_tsvector('english', 'Il vino è buono');
SELECT to_tsvector('italian', 'Il vino è buono');
SELECT to_tsvector('simple', 'Il vino è buono');

推荐阅读