python - How to manipulate strings in an SRT file in Python?
Problem description
If I have an SRT file like this:
1
00:00:00672 --> 00:00:05568
This is about

2
00:00:05664 --> 00:00:11175
whatever

3
00:00:11303 --> 00:00:16359
I don't know

4
00:00:16423 --> 00:00:20647
you don't know
But the format is wrong, because the comma is missing from the timestamps. It should look like this:
1
00:00:00,672 --> 00:00:05,568
This is about

2
00:00:05,664 --> 00:00:11,175
whatever

3
00:00:11,303 --> 00:00:16,359
I don't know

4
00:00:16,423 --> 00:00:20,647
you don't know
How can I fix this with Python? Thanks.
Solution
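You can match the hh:mm:ss part of each timestamp and use a lookahead to assert that exactly three digits (the milliseconds) follow it.
\b\d{2}:\d{2}:\d{2}(?=\d{3}\b)
Then replace with the full match followed by a comma:
r"\g<0>,"
Note that prefixing the pattern with "--> " would fix only the end timestamp on each line. Starting at a word boundary instead lets the pattern match both the start and the end timestamp.
import re

# hh:mm:ss, only when three more digits (the milliseconds) follow
regex = r"\b\d{2}:\d{2}:\d{2}(?=\d{3}\b)"

s = ("1\n"
     "00:00:00672 --> 00:00:05568\n"
     "This is about\n\n"
     "2\n"
     "00:00:05664 --> 00:00:11175\n"
     "whatever\n\n"
     "3\n"
     "00:00:11303 --> 00:00:16359\n"
     "I don't know\n\n"
     "4\n"
     "00:00:16423 --> 00:00:20647\n"
     "you don't know")

# \g<0> is the whole match, so this appends a comma after hh:mm:ss
result = re.sub(regex, r"\g<0>,", s)

if result:
    print(result)
Output
1
00:00:00,672 --> 00:00:05,568
This is about

2
00:00:05,664 --> 00:00:11,175
whatever

3
00:00:11,303 --> 00:00:16,359
I don't know

4
00:00:16,423 --> 00:00:20,647
you don't know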
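To apply the same substitution to a file on disk rather than to a hard-coded string, something along these lines might work. It is only a sketch: the helper names fix_srt_timestamps and fix_srt_file are hypothetical, the UTF-8 default encoding is an assumption, and the check for "-->" is an extra precaution (not part of the answer above) so that subtitle text is never modified.

```python
import re
from pathlib import Path

# hh:mm:ss, matched only when exactly three more digits (the milliseconds) follow.
TIMESTAMP_RE = re.compile(r"\b\d{2}:\d{2}:\d{2}(?=\d{3}\b)")


def fix_srt_timestamps(text: str) -> str:
    """Insert the missing comma before the milliseconds on timing lines."""
    fixed_lines = []
    for line in text.splitlines(keepends=True):
        # Only touch timing lines (assumed to be the ones containing "-->").
        if "-->" in line:
            line = TIMESTAMP_RE.sub(r"\g<0>,", line)
        fixed_lines.append(line)
    return "".join(fixed_lines)


def fix_srt_file(src: str, dst: str, encoding: str = "utf-8") -> None:
    """Read src, repair its timestamps, and write the result to dst."""
    text = Path(src).read_text(encoding=encoding)
    Path(dst).write_text(fix_srt_timestamps(text), encoding=encoding)
```

Because the lookahead requires digits directly after hh:mm:ss, a timestamp that already contains its comma no longer matches, so running the function twice on the same text should leave it unchanged.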