Searching documents with the unfiltered option where a value starts with a string containing special characters?

Question

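I am trying to get an estimate of the documents whose "Key" property value starts with a string that contains the special character "/".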

let query = 
    cts.andQuery([
        cts.jsonPropertyWordQuery("Key", "IBD/info/*", ["lang=en"],1), 
        cts.collectionQuery("documentCollection")
    ], [])

cts.estimate(query)

However, word-query() internally tokenizes "IBD/info/" as (cts:word("IBD"), cts:punctuation("/"), cts:word("info"), ...), so the value is not matched as a single prefix.
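To make the tokenization issue concrete, here is a rough sketch in plain JavaScript (runnable in Node.js, not inside MarkLogic). The `tokenize` function and its `wordChars` parameter are hypothetical and only imitate the behaviour described above; they are not MarkLogic's tokenizer. Passing "/" as an extra word character is meant to mimic what a tokenizer override with tokenizer-class "word" does.

```javascript
// Toy tokenizer for illustration only; NOT MarkLogic's implementation.
// `wordChars` lists extra characters to treat as part of a word,
// imitating a tokenizer-override with tokenizer-class "word".
function tokenize(text, wordChars = "") {
  const tokens = [];
  let current = "";
  const isWordChar = (ch) => /[\p{L}\p{N}]/u.test(ch) || wordChars.includes(ch);
  for (const ch of text) {
    if (isWordChar(ch)) {
      current += ch;
      continue;
    }
    // A non-word character ends the current word token, if any.
    if (current) {
      tokens.push({ type: "word", text: current });
      current = "";
    }
    tokens.push({ type: /\s/.test(ch) ? "space" : "punctuation", text: ch });
  }
  if (current) tokens.push({ type: "word", text: current });
  return tokens;
}

const format = (tokens) => tokens.map((t) => `${t.type}("${t.text}")`).join(", ");

// Default behaviour: "/" splits the value into several tokens.
const defaultTokens = tokenize("IBD/info/");
console.log(format(defaultTokens));

// With "/" treated as a word character, the value stays one word token.
const overrideTokens = tokenize("IBD/info/", "/");
console.log(format(overrideTokens));
```

In the first call the value breaks into four tokens (two words, two punctuation marks); in the second it stays a single word, which is what a trailing wildcard needs in order to act as a prefix match on the whole value.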

I created a field with the following configuration:

"field": [
        {
            "field-name": "key",
            "field-path": [
                {
                    "path": "/envelope/instance/Key",
                    "weight": 1
                }
            ],
            "stemmed-searches": "advanced",
            "field-value-searches": true,
            "field-value-positions": true,
            "trailing-wildcard-searches": true,
            "trailing-wildcard-word-positions": true,
            "tokenizer-override": [
                {
                    "character": "/",
                    "tokenizer-class": "word"
                }
            ]
        }
]

and tried the following query, but I still get false positives:

cts:search(
  fn:doc(),
  cts:and-query((
      cts:field-value-query("key","IBD/info/*"),
      cts:collection-query("documentCollection")
  )),
  "unfiltered"
)

How should I handle this case?

Tags: marklogic, marklogic-10

Solution


Create the field, together with a string range index on it, using the following configuration:

"field": [
    {
      "field-name": "key",
      "field-path": [
        {
          "path": "/envelope/instance/Key",
          "weight": 1
        }
      ],
      "field-value-searches": true,
      "trailing-wildcard-searches": true,
      "three-character-searches": false,
      "tokenizer-override": [
        {
          "character": "/",
          "tokenizer-class": "word"
        },
        {
          "character": "_",
          "tokenizer-class": "word"
        }
      ]
    }
  ],
  "range-field-index": [
    {
      "scalar-type": "string",
      "field-name": "key",
      "collation": "http://marklogic.com/collation/",
      "range-value-positions": false,
      "invalid-values": "reject"
    }
  ]

Once reindexing has finished, run the following query:

let query = 
    cts.andQuery([
        cts.fieldValueQuery("key", "IBD/info/*"),
        cts.collectionQuery("documentCollection")
    ], [])
cts.search(query,"unfiltered")

The query then returns only documents whose "Key" value starts with "IBD/info/".
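As a quick sanity check of the intended result, the expected matching can be sketched in plain JavaScript (Node.js, outside MarkLogic). The helper `matchesTrailingWildcard` and the sample keys are made up for illustration. The sketch treats the trailing `*` as "anything after the prefix" and ignores case and diacritic sensitivity, so it is a simplification of what the server does.

```javascript
// Hypothetical helper: compares a whole value against a pattern that may
// end in a single "*". A simplification, not MarkLogic's wildcard engine.
function matchesTrailingWildcard(value, pattern) {
  if (!pattern.endsWith("*")) return value === pattern;
  const prefix = pattern.slice(0, -1);
  return value.startsWith(prefix);
}

// Made-up "Key" values, including ones an unfiltered word query
// could plausibly report as false positives.
const sampleKeys = [
  "IBD/info/2021",
  "IBD/info/",
  "IBD/information",
  "XYZ/IBD/info/1",
  "IBD info 2021",
];

const matched = sampleKeys.filter((key) => matchesTrailingWildcard(key, "IBD/info/*"));
console.log(matched);
```

Only the first two sample keys begin with the literal prefix "IBD/info/", so only those are kept. Comparing such a list against the real search results on a small test collection is one way to confirm that the field configuration has removed the false positives.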

