Regular expression when the key order is not fixed

Question

I have data in the following form:

{"product": [{ "id": "", "name": "some text", "purchased_at": ""} , {..}, {..}]}
{"product": [{ "name": "", "id": "some text", "purchased_at": ""} , {..}, {..}]}
{"product": [{ "purchased_at": "", "id": "some text", "name": ""} , {..}, {..}]}
...

The order of the keys is not fixed, and the regular expression I wrote does not capture the other layouts:

"name":\s*"(.*?)","purchased_at":\s*"(.*?)",.*?"id":\s*"(.*?)"

How can I modify it so that it handles changes in the key order?

Tags: python, regex

Solution


Try this:

import re
# `line` is one input record; the raw string keeps \s from being read as a string escape
m = re.search(r'^(?=.*"name":\s*"(?P<name>.*?)")(?=.*"id":\s*"(?P<id>.*?)")(?=.*"purchased_at":\s*"(?P<purchased_at>.*?)").*', line)
result = {"name": m.group('name'), "id": m.group('id'), "purchased_at": m.group('purchased_at')}

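This uses a separate lookahead to capture each key/value pair, so their order in the input does not matter. The groups are named so that they can be accessed by name rather than by position, which is the usual way.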


>>> import re
>>> m = re.search(r'^(?=.*"name":\s*"(?P<name>.*?)")(?=.*"id":\s*"(?P<id>.*?)")(?=.*"purchased_at":\s*"(?P<purchased_at>.*?)").*', '{"product": [{ "id": "id1", "name": "name1", "purchased_at": "pa1"} , {..}, {..}]}')
>>> result = {"name": m.group('name'), "id": m.group('id'), "purchased_at": m.group('purchased_at')}
>>> print(result)
{'name': 'name1', 'id': 'id1', 'purchased_at': 'pa1'}
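
To check that the lookahead pattern above really is independent of key order, here is a minimal sketch that runs it against all three layouts from the question. The helper name extract_fields and the sample values are assumptions made for illustration and are not part of the original answer.

```python
import re

# The same three lookaheads as in the answer, compiled once as a raw string.
# Each lookahead starts at the beginning of the line, so key order is irrelevant.
PATTERN = re.compile(
    r'^(?=.*"name":\s*"(?P<name>.*?)")'
    r'(?=.*"id":\s*"(?P<id>.*?)")'
    r'(?=.*"purchased_at":\s*"(?P<purchased_at>.*?)")'
)


def extract_fields(line):
    """Return the named groups as a dict, or None if any key is missing."""
    m = PATTERN.search(line)
    return m.groupdict() if m else None


# Hypothetical sample lines: identical values, different key order.
samples = [
    '{"product": [{ "id": "id1", "name": "name1", "purchased_at": "pa1"}]}',
    '{"product": [{ "name": "name1", "id": "id1", "purchased_at": "pa1"}]}',
    '{"product": [{ "purchased_at": "pa1", "id": "id1", "name": "name1"}]}',
]

results = [extract_fields(s) for s in samples]
for r in results:
    print(r)
```

One caveat: because each `.*` is greedy, every lookahead settles on the last occurrence of its key in the line. If a line holds several product objects, the captured values come from the end of the line and are not guaranteed to belong to the same object.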
