Capturing groups from a movie title

Question

I am trying to capture the following groups from a movie title:

file = "The Great Home Se01E01 Meatballs for Dinner"

<show> = "The Great Home"
<season> = "Se01"
<episode> = "E01"
<title> = "Meatballs for Dinner"

So far I have only managed to capture part of it, using the following code:

import re

file = "The Great Home Se01E01 Meatballs for Dinner"
seasonEpID = re.search(r'(\bS/?.+\d{1,2})+(E/?.+\d{1,2})', file)
print(seasonEpID.groups())

It returns the following:

('Se01', 'E01')

How can I capture all four groups: <show>, <season>, <episode>, and <title>?

Tags: python, regex, re

Solution


I would use re.findall with the following regular expression pattern:

^(.*?)\s+(Se\d+)(E\d+)\s+(.*)$
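To make the pattern easier to read, here is a minimal sketch that spells out the same expression with re.VERBOSE, one commented piece per line. The name PATTERN is only an illustrative choice, and re.search is used here just to show the four groups of a single match.

```python
import re

# The same pattern, written with re.VERBOSE so each part can carry a comment.
# In verbose mode, whitespace inside the pattern is ignored.
PATTERN = re.compile(r"""
    ^(.*?)     # show: as few characters as possible from the start
    \s+        # whitespace between the show and the season marker
    (Se\d+)    # season: the letters Se followed by one or more digits
    (E\d+)     # episode: the letter E followed by one or more digits
    \s+        # whitespace between the episode marker and the title
    (.*)$      # title: everything up to the end of the string
""", re.VERBOSE)

match = PATTERN.search("The Great Home Se01E01 Meatballs for Dinner")
print(match.groups())
```

The lazy (.*?) at the start matters: it stops at the first place where a season marker can follow, so the show name does not swallow the rest of the string.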

Example script:

import re

file = "The Great Home Se01E01 Meatballs for Dinner"
parts = re.findall(r'^(.*?)\s+(Se\d+)(E\d+)\s+(.*)$', file)
print(parts)

This prints:

[('The Great Home', 'Se01', 'E01', 'Meatballs for Dinner')]
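Because re.findall returns a list of tuples, a single file name still has to be unpacked from a one-element list. As a possible variation (not part of the answer above), the same pattern can be given named groups and used with re.match, which also makes it easy to handle names that do not match. The helper name parse_episode and the constant EPISODE_RE are hypothetical, chosen only for this sketch.

```python
import re

# Same pattern as in the answer, with each group named after the part it captures.
EPISODE_RE = re.compile(
    r'^(?P<show>.*?)\s+(?P<season>Se\d+)(?P<episode>E\d+)\s+(?P<title>.*)$'
)

def parse_episode(name):
    """Hypothetical helper: return the four parts as a dict, or None if no match."""
    match = EPISODE_RE.match(name)
    return match.groupdict() if match else None

print(parse_episode("The Great Home Se01E01 Meatballs for Dinner"))
print(parse_episode("no season marker here"))
```

The first call yields a dictionary keyed by show, season, episode, and title; the second returns None because the string contains no Se/E marker.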
