python - Split a string into substrings

Question

I want to split a string into separate strings, one for each of these fields:

name,city,points,score,cards

I have these strings:

Paul Grid - Hong Kong 56  663 0
Anna Grid - Tokyo 16  363 0
Greg H.Johs - Hong Kong -6  363 4
Jessy Holm Smith - Jakarta 8  261 0

The format is:

Name[SPACE]-[SPACE]City[SPACE]-Points[SPACE][SPACE]Score[SPACE]Cards

Name can contain spaces and '.'
City can contain spaces
There are sometimes two spaces between Points and Score
Points, Score and Cards can be negative

The rules I want to implement in Python are as follows:

Name: from the beginning until you see "-", then strip the trailing space from that string.
Cards: from the end backwards, until you meet the first space.
Score: from the space you hit when reading Cards, go back until the next space.
Points: from the space you hit when reading Score, go back until the next space.
City: from where Name ended to where Points stopped at its space.

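My problem is that I can't simply use the space as a delimiter, because spaces can occur inside the name and the city, and "-" is used to separate name and city.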

I could brute-force it and walk through the string character by character, but I wonder whether Python has a smarter way to do this?

As the end result I want each line split into fields, so that I can address them as e.g. scorerecord.name, scorerecord.city, and so on.

Tags: python, python-3.x

Just another regex pattern:

import re

text = """Paul Grid - Hong Kong 56  663 0
Anna Grid - Tokyo 16  363 0
Greg H.Johs - Hong Kong -6  363 4
Jessy Holm Smith - Jakarta 8  261 0"""


# Groups: name, city, points, score, cards (each number may carry a leading "-")
pat = r'^([^-]+) - ?([^-]+?)(?= -?\d+) (-?\d+) +(-?\d+) +(-?\d+)$'

for k in re.findall(pat,text,re.MULTILINE):
    print(k)

This produces the output:

('Paul Grid', 'Hong Kong', '56', '663', '0')
('Anna Grid', 'Tokyo', '16', '363', '0')
('Greg H.Johs', 'Hong Kong', '-6', '363', '4')
('Jessy Holm Smith', 'Jakarta', '8', '261', '0')
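The question asks for attribute access such as scorerecord.name. As a minimal sketch, the tuples returned by re.findall can be wrapped in a namedtuple. The names ScoreRecord and parse_records are made up for this illustration and are not part of the original answer, and converting the numeric fields to int is an assumption about what is wanted.

```python
import re
from collections import namedtuple

# Hypothetical record type; field names follow the list in the question.
ScoreRecord = namedtuple("ScoreRecord", "name city points score cards")

# Same pattern as in the answer above.
pat = r'^([^-]+) - ?([^-]+?)(?= -?\d+) (-?\d+) +(-?\d+) +(-?\d+)$'

def parse_records(text):
    """Return one ScoreRecord per matching line, numeric fields as int."""
    records = []
    for name, city, points, score, cards in re.findall(pat, text, re.MULTILINE):
        records.append(ScoreRecord(name, city, int(points), int(score), int(cards)))
    return records

text = """Paul Grid - Hong Kong 56  663 0
Anna Grid - Tokyo 16  363 0
Greg H.Johs - Hong Kong -6  363 4
Jessy Holm Smith - Jakarta 8  261 0"""

records = parse_records(text)
for scorerecord in records:
    print(scorerecord.name, "|", scorerecord.city, "|", scorerecord.points)
```

Lines that do not match the pattern are silently skipped by re.findall, so comparing len(records) with the number of input lines is a cheap sanity check.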

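Explanation:

The two text parts are captured by '([^-]+) - ?([^-]+?)': each is "one or more of anything except -", with ' - ' between them.
The second text part must be followed by '(?= -?\d+)', i.e. a space, an optional - and one or more digits, checked via a positive lookahead.
The numbers are then captured with ' (-?\d+)', again with an optional sign. Everything must fit on one line, '^ .... $', with re.MULTILINE enabled so that ^ and $ match at each line boundary.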
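For comparison, the rules listed in the question can also be followed fairly literally with plain string methods instead of a regex. This is only a sketch: split_record is a hypothetical helper name, and it assumes that the first " - " (with a space on each side) always separates name and city.

```python
def split_record(line):
    """Split one line into (name, city, points, score, cards)."""
    # Name: everything before the first " - " separator.
    name, sep, rest = line.partition(" - ")
    if not sep:
        raise ValueError(f"no ' - ' separator in {line!r}")
    # Cards, Score, Points: the last three whitespace-separated tokens,
    # taken from the right; runs of spaces count as a single separator.
    # City: whatever remains in front of them.
    city, points, score, cards = rest.rsplit(None, 3)
    return name.strip(), city.strip(), int(points), int(score), int(cards)

print(split_record("Greg H.Johs - Hong Kong -6  363 4"))
```

One difference from the regex: because this only splits on " - " with surrounding spaces, a hyphen inside a name (say, a made-up "Anne-Marie Holm") is kept, whereas '[^-]+' would refuse to match such a line.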
