Fuzzy string matching with jq

Problem description

Suppose I have some JSON in a file. It is a subset of data extracted from a much larger JSON file, which is why I use --stream in my attempted solution further down. It looks like this:

[
{"_id":"1","@":{},"article":false,"body":"Hello world","comments":"3","createdAt":"20201007200628","creator":{"id":"4a7ba8fd719d43598b977dd548eed6aa","bio":"","blocked":false,"followed":false,"human":false,"integration":false,"joined":"20201007200628","muted":false,"name":"mkscott","rss":false,"private":false,"username":"mkscott","verified":false,"verifiedComments":false,"badges":[],"score":"0","interactions":258,"state":1},"depth":"0","depthRaw":0,"hashtags":[],"id":"2d4126e342ed46509b55facb49b992a5","impressions":"3","links":[],"sensitive":false,"state":4,"upvotes":"0"},
{"_id":"2","@":{},"article":false,"body":"Goodbye world","comments":"3","createdAt":"20201007200628","creator":{"id":"4a7ba8fd719d43598b977dd548eed6aa","bio":"","blocked":false,"followed":false,"human":false,"integration":false,"joined":"20201007200628","muted":false,"name":"mkscott","rss":false,"private":false,"username":"mkscott","verified":false,"verifiedComments":false,"badges":[],"score":"0","interactions":258,"state":1},"depth":"0","depthRaw":0,"hashtags":[],"id":"2d4126e342ed46509b55facb49b992a5","impressions":"3","links":[],"sensitive":false,"state":4,"upvotes":"0"}
],
[
{"_id":"55","@":{},"article":false,"body":"Hello world","comments":"3","createdAt":"20201007200628","creator":{"id":"3a7ba8fd719d43598b977dd548eed6aa","bio":"","blocked":false,"followed":false,"human":false,"integration":false,"joined":"20201007200628","muted":false,"name":"mkscott","rss":false,"private":false,"username":"jkscott","verified":false,"verifiedComments":false,"badges":[],"score":"0","interactions":258,"state":1},"depth":"0","depthRaw":0,"hashtags":[],"id":"2d4126e342ed46509b55facb49b992a5","impressions":"3","links":[],"sensitive":false,"state":4,"upvotes":"0"},
{"_id":"56","@":{},"article":false,"body":"Goodbye world","comments":"3","createdAt":"20201007200628","creator":{"id":"3a7ba8fd719d43598b977dd548eed6aa","bio":"","blocked":false,"followed":false,"human":false,"integration":false,"joined":"20201007200628","muted":false,"name":"mkscott","rss":false,"private":false,"username":"jkscott","verified":false,"verifiedComments":false,"badges":[],"score":"0","interactions":258,"state":1},"depth":"0","depthRaw":0,"hashtags":[],"id":"2d4126e342ed46509b55facb49b992a5","impressions":"3","links":[],"sensitive":false,"state":4,"upvotes":"0"}
]

It describes 4 posts written by 2 different authors, and each post has a unique _id field. Each author wrote 2 posts, one saying "Hello world" and the other saying "Goodbye world".

I want to match the word "Hello" and return only the _id field of the posts whose body contains "Hello". The expected result is:

1
55

My closest attempt is:

jq -nr --stream '
fromstream(1|truncate_stream(inputs))
| select(.body %like% "Hello")
| ._id
' <input_file

Tags: json, select, jq, string-matching

Solution


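The kind of task you describe does not usually require the streaming parser (invoked with --stream), so in this answer I will assume that the following filter, or a variant of it, is sufficient: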

.[]
| select( .body | test("Hello") )._id

This of course assumes that the input is valid JSON.
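As a rough illustration of how the filter above might be invoked, here is a minimal sketch. The file name posts.json and the trimmed-down sample records are assumptions made for this example, not the asker's real data. It also shows two variants that may be useful: the "i" flag of test for case-insensitive matching, and contains for a literal substring check with no regex semantics.

```shell
# Hypothetical sample: one valid JSON array holding simplified posts.
cat > posts.json <<'EOF'
[
  {"_id": "1",  "body": "Hello world"},
  {"_id": "2",  "body": "Goodbye world"},
  {"_id": "55", "body": "Hello world"},
  {"_id": "56", "body": "Goodbye world"}
]
EOF

# -r prints raw strings instead of JSON-quoted ones.
# Prints 1 and 55 on separate lines.
jq -r '.[] | select(.body | test("Hello"))._id' posts.json

# Case-insensitive regex match via the "i" flag.
jq -r '.[] | select(.body | test("hello"; "i"))._id' posts.json

# Literal substring match, no regular expression involved.
jq -r '.[] | select(.body | contains("Hello"))._id' posts.json
```

Note that test treats its argument as a regular expression, so contains is likely the safer choice when the search term may include characters such as "." or "*".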

Handling comma-separated JSON

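If your input is a comma-separated stream of JSON values, as shown in the question, you can use the following together with the -n command-line option: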

# This is a variant of the built-in `recurse/1`:
def iterate(f): def r: f | (., r); r;

iterate( inputs? | .[] | select( .body | test("Hello") )._id )

Note that this assumes that anything appearing on a line after a separating comma can be ignored.
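A different approach, not the one given in this answer, is to turn the comma-separated arrays into valid JSON before jq sees them by wrapping the whole input in one outer pair of brackets. The sketch below assumes the input consists only of arrays separated by commas, as in the question, and is small enough to parse in one piece. The file name posts_stream.txt and the simplified records are hypothetical.

```shell
# Hypothetical sample: two arrays separated by a comma (not valid JSON as a whole).
cat > posts_stream.txt <<'EOF'
[
{"_id": "1",  "body": "Hello world"},
{"_id": "2",  "body": "Goodbye world"}
],
[
{"_id": "55", "body": "Hello world"},
{"_id": "56", "body": "Goodbye world"}
]
EOF

# Wrap the input in [ ... ] so it becomes one array of arrays,
# then iterate over both levels with .[][] and apply the same select.
# Prints 1 and 55 on separate lines.
{ echo '['; cat posts_stream.txt; echo ']'; } \
  | jq -r '.[][] | select(.body | test("Hello"))._id'
```

This avoids depending on how jq recovers from the parse error at each comma, at the cost of reading the whole wrapped document as a single JSON value.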

