eXist-db - XQuery - Lucene - controlling output in KWIC function with callback parameter

Problem description

(This question is related to an attempt to implement the answer to a question someone else posed a few years ago: "Ignored XML elements show up near eXist-db's lucene search results".)

In eXist-db 4.4 I have the following Lucene index definition:

<collection xmlns="http://exist-db.org/collection-config/1.0">
  <index xmlns:tei="http://www.tei-c.org/ns/1.0" xmlns:xs="http://www.w3.org/2001/XMLSchema">
    <lucene>
        <analyzer class="org.apache.lucene.analysis.standard.StandardAnalyzer"/>
        <text qname="tei:seg"/>
        <text qname="tei:persName"/>
        <text qname="tei:placeName"/>
        <ignore qname="tei:note"/>
        <ignore qname="tei:pb"/>
        <ignore qname="tei:gap"/>
        <ignore qname="tei:del"/>
        <ignore qname="tei:orig"/>
        <inline qname="tei:supplied"/>
    </lucene>
  </index>
</collection>

It gets applied to content that always looks like this:

 <seg type="dep_event" subtype="event" xml:id="MS609-0209-2">Item. Alia 
   <del type="notary" rend="expunctus">die</del> vice vidit 
   <placeName type="event_loc" nymRef="#home_of_Cap-de-Porc">in eodem hospitio</placeName> 
   <persName nymRef="#Bernard_Cap-de-Porc_MSP-AU" role="her">Bernardum</persName>
   <note type="public">Assumed Bernard Cap-de-Porc based on <foreign xml:lang="LA">eodem hospitio</foreign>.</note> et socium suum, hereticos. Et vidit ibi cum eis 
   <persName nymRef="#Arnald_Godalh_MSP-AU" ana="#pAdo" role="par">Arnaldum<lb break="y" n="7"/>Godalh</persName>; et 
   <persName nymRef="#Guilhem_de_Rosengue_MSP-AU" ana="#pAdo #pBring" role="par">W<supplied reason="expname">illelmum</supplied>, 
        <roleName type="fam">filium dicti testis</roleName></persName>, 
     qui duxit ipsum testis ad dictos hereticos; et ipsum 
    <persName nymRef="#Peire_Cap-de-Porc_MSP-AU" ana="#pAdo" role="par">Cap de Porc</persName> et 
    <persName nymRef="#Susanna_Cap-de-Porc_MSP-AU" ana="#pAdo" role="par"/>uxorem eius. Et 
    <persName nymRef="#Arnald_de_Rosengue_MSP-AU" ana="#pAdo" role="par"/>ipse testis et omnes alii adoraverunt<lb break="y" n="8"/>ibi dictos hereticos. Et 
    <date type="event_date" when="1240">sunt V anni vel circa</date>.
  </seg>

My search focuses on the content found inside tei:seg, but I want to ignore certain elements found inside, such as tei:note and tei:del. The Lucene engine ignores those fields correctly. The query looks like this:

let $query := 
           <query>
             <term>hospitio</term>
           </query>

for $hit in collection($URIdata)//tei:text/tei:body//tei:seg[@type="dep_event"][ft:query(.,$query)]                                           

order by ft:score($hit) descending

return
    kwic:summarize($hit, <config width="80" table="yes"/>)

The query returns the following through the kwic:summarize function. Neither tei:note nor tei:del is ignored in the output:

<tr>
   <td class="previous">Item. Alia die vice vidit 
           in eodem </td>
   <td class="hi">hospitio</td>
   <td class="following">Bernardum Assumed Bernard Cap-de-Porc based on e...</td>
</tr>

According to eXist-db documentation (and this SO question) suppressing output to screen of these elements is controlled through an additional callback parameter. I tried to add a callback in the query:

kwic:summarize($hit, <config width="80" table="yes"/>, search:kwic-filter)

...which requests this function:

declare %private function search:kwic-filter($node as node(), $mode as xs:string) as xs:string? {
let $ignored-elements := doc(concat($globalvar:URIdb,"collection.xconf"))//*:ignore/@qname/string()
let $ignored-elements := 
    for $ignored-element in $ignored-elements
    let $ignored-element := substring-after($ignored-element, ':')
    return $ignored-element
return
    if (local-name($node/parent::*) = ($ignored-elements)) 
    then ()
    else 
        if ($mode eq 'before') 
        then concat($node, ' ')
        else concat(' ', $node)
};

...but I get the error

err:XPDY0002 Undefined context sequence for 'child::search:kwic-filter'

I don't see how to connect the query to the callback, or perhaps how to write the callback function in this case (note that the callback function looks up the elements to ignore from the collection.xconf shown at the top of this question).

Tags: xpath, lucene, xquery, exist-db

Solution
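
One likely cause of the XPDY0002 error: a bare name such as search:kwic-filter in expression position is parsed as a path step (a child element with that name), not as a function. In XQuery 3.0 and later, a function is passed as a value with a named function reference of the form name#arity. The following is a minimal sketch of that approach, not a confirmed fix for this setup. It makes several assumptions: the callback is declared in the main query as local:kwic-filter, the list of ignored element names is hard-coded instead of being read from collection.xconf, the collection path is a placeholder for $URIdata, and the check uses the ancestor axis rather than parent so that text nested deeper inside an ignored element (such as tei:foreign inside tei:note) is also suppressed.

```xquery
xquery version "3.1";

import module namespace kwic = "http://exist-db.org/xquery/kwic";

declare namespace tei = "http://www.tei-c.org/ns/1.0";

(: Assumption: hard-coded local names mirroring the <ignore> entries
   in collection.xconf; the original looks them up at run time. :)
declare variable $local:ignored := ("note", "pb", "gap", "del", "orig");

(: Assumption: placeholder path standing in for $URIdata. :)
declare variable $local:data := "/db/apps/myapp/data";

(: Called by kwic:summarize for the text nodes around a match.
   Returning the empty sequence drops the text from the summary. :)
declare function local:kwic-filter($node as node(), $mode as xs:string) as xs:string? {
    if ($node/ancestor::*[local-name() = $local:ignored])
    then ()
    else if ($mode eq "before")
    then concat($node, " ")
    else concat(" ", $node)
};

let $query :=
    <query>
        <term>hospitio</term>
    </query>
for $hit in collection($local:data)//tei:text/tei:body//tei:seg[@type = "dep_event"][ft:query(., $query)]
order by ft:score($hit) descending
return
    (: local:kwic-filter#2 is a reference to the two-argument function :)
    kwic:summarize($hit, <config width="80" table="yes"/>, local:kwic-filter#2)
```

With the function kept in the search module from the question, the equivalent call would presumably pass search:kwic-filter#2 as the third argument. Whether the %private annotation interferes when the KWIC module invokes the callback is not something this sketch verifies.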

