Solr: DIH for a MySQL query with a multivalued field?

Problem description

I'm trying to set up a multivalued field in Solr, but it isn't working in my case.

Database query result (example)

|id  | another_id    | name          | phone       | type        |
|----------------------------------------------------------------|
|'1' | '11'          | 'F. Brown'    | '112233440' | 'employee'  |
|'2' | '22'          | 'Jhon Smith'  | '123123123' | 'guest'     |
|'2' | '22'          | 'Jhon Smith'  | '321321321' | 'guest'     |

Solr-data-config.xml

<?xml version="1.0" encoding="UTF-8"?>
<dataConfig>
  <dataSource   type="JdbcDataSource"
                driver="com.mysql.jdbc.Driver"
                url="jdbc:mysql://localhost:3306/servme_prd"
                user="root"
                password="root" />
  <document>
    <entity name="person_cards" query="SELECT table1.id, table2.id AS another_id, table1.name, table2.phone, table1.type 
        FROM table1
        INNER JOIN table2 ON table1.id = table2.fk_id">
        <field column="id" name="uid" />
        <field column="another_id" name="pid" />
        <field column="name" name="name" />
        <field column="phone" name="phone" />
        <field column="type" name="type"/>
    </entity>
</document>
</dataConfig>

managed-schema.xml

<uniqueKey>uid</uniqueKey>
<field name="_version_" type="plong" indexed="false" stored="false"/>
<field name="uid" type="string" docValues="false" multiValued="false" indexed="true" required="true" stored="true"/>
<field name="pid" type="string" docValues="false" multiValued="false" indexed="true" required="true" stored="true"/>
<field name="name" type="string" indexed="true" stored="true"/>
<field name="phone" type="string" docValues="false" multiValued="true" indexed="true" stored="true"/>
<field name="type" type="string" indexed="true" stored="true"/>

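Whenever I run a full import, phone does not come back as a multivalued field with all of a person's numbers. Example Solr query response: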

{
    "name":"F. Brown",
    "uid":"1",
    "pid":"11",
    "phone":["112233440"],
    "type":"employee",
    "_version_":1608065390436417536
},
{
    "name":"Jhon Smith",
    "uid":"2",
    "pid":"22",
    "phone":["123123123"],
    "type":"guest",
    "_version_":1608065390436417536
},
{
    "name":"Jhon Smith",
    "uid":"2",
    "pid":"22",
    "phone":["321321321"],
    "type":"guest",
    "_version_":1608065390436417536
}

I would like to get the following response from a Solr query instead:

{
    "name":"F. Brown",
    "uid":"1",
    "pid":"11",
    "phone":["112233440"],
    "type":"employee",
    "_version_":1608065390436417536
},
{
    "name":"Jhon Smith",
    "uid":"2",
    "pid":"22",
    "phone":["123123123", "321321321"],
    "type":"guest",
    "_version_":1608065390436417536
}

Is something missing from my Solr configuration that keeps the multivalued field from working as expected?

By the way, I'm using Solr 7.4 installed on an Ubuntu 14 server. Thanks.

Tags: solr, dih

Solution


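Since you're using MySQL, a quick fix is to collapse the phone numbers into one row per person with GROUP_CONCAT, then split that column back into separate values with DIH's RegexTransformer: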

<entity transformer="RegexTransformer" name="person_cards" query="SELECT 
        table1.id, 
        table2.id AS another_id, 
        table1.name, 
        GROUP_CONCAT(table2.phone) AS phone, table1.type 
    FROM table1
    INNER JOIN table2 ON table1.id = table2.fk_id
    GROUP BY table1.id
    ">
    ...
    <field column="phone" name="phone" splitBy="," />
    ...
</entity>
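
DIH builds one Solr document per row returned by the root entity, so two rows with the same uid are not merged into one document. Returning a single row per person from SQL is what lets splitBy fill the multivalued field. A fuller sketch of the entity, reusing the table and column names from the question, might look like the following. The explicit SEPARATOR, the MIN(table2.id) aggregate, and the assumption that table1.id is the primary key of table1 are illustrative choices, not something the question confirms; they are there so the query stays valid when MySQL runs with ONLY_FULL_GROUP_BY enabled.

```xml
<!-- Sketch only: assumes table1.id is the primary key of table1. -->
<entity name="person_cards" transformer="RegexTransformer"
        query="SELECT table1.id,
                      MIN(table2.id) AS another_id,
                      table1.name,
                      GROUP_CONCAT(table2.phone SEPARATOR ',') AS phone,
                      table1.type
               FROM table1
               INNER JOIN table2 ON table1.id = table2.fk_id
               GROUP BY table1.id">
    <field column="id" name="uid" />
    <!-- MIN() is an assumed way to pick a single pid per person -->
    <field column="another_id" name="pid" />
    <field column="name" name="name" />
    <!-- splitBy is a regular expression; each piece becomes one value of phone -->
    <field column="phone" name="phone" splitBy="," />
    <field column="type" name="type" />
</entity>
```

Note that GROUP_CONCAT truncates its result at group_concat_max_len (1024 bytes by default), so a person with very many phone numbers may need that setting raised. If a value could itself contain a comma, choose a separator that cannot occur in the data and escape it in splitBy where the regular expression syntax requires it.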
