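python - How to extract text within a given time range
Problem description
I have the text below. How can I extract only the text that falls between two timestamps? I already have code that extracts all of the values.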
s = '''00:00:14,099 --> 00:00:19,100
a classic math problem a
00:00:17,039 --> 00:00:28,470
will come from an unexpected place
00:00:18,039 --> 00:00:19,470

00:00:20,039 --> 00:00:21,470

00:00:22,100 --> 00:00:30,119
binary numbers first I'm going to give
00:00:30,119 --> 00:00:35,430
puzzle and then you can try to solve it
00:00:32,489 --> 00:00:37,170
like I said you have a thousand bottles'''
Can I extract the text between 00:00:17,039 --> 00:00:28,470
and 00:00:30,119?
The code below returns all of the values:
import re
lines = s.split('\n')
result = {}          # avoid naming this `dict`, which would shadow the builtin
current_key = None   # timestamp line of the cue currently being read
for line in lines:
    is_key_match_obj = re.search(r'([\d:,]{12})(\s-->\s)([\d:,]{12})', line)
    if is_key_match_obj:
        # A timestamp line starts a new cue
        current_key = is_key_match_obj.group()
        print(current_key)
        continue
    if current_key:
        if current_key in result:
            if not line:
                result[current_key] += '\n'
            else:
                result[current_key] += line
        else:
            result[current_key] = line
print(result.values())
Expected output, from 00:00:17,039 --> 00:00:28,470
to 00:00:30,119 --> 00:00:35,430:
dict_values(['will come from an unexpected place', '', '', "binary numbers first I'm going to give", 'puzzle and then you can try to solve it'])
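Solution
There is no need to iterate line by line. Try the code below: re.findall returns (timestamp line, following line) pairs, and passing them to dict() gives you the dictionary you want.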
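import re
# Each match is a (timestamp line, next line) pair; the result is not
# named `dict` so that the builtin stays callable.
cues = dict(re.findall(r'(\d{2}:\d{2}.*)\n(.*)', s))
print(cues.values())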
Output
dict_values(['a classic math problem a', 'will come from an unexpected place', '', '', "binary numbers first I'm going to give", 'puzzle and then you can try to solve it', 'like I said you have a thousand bottles'])
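The snippet above returns every value, while the question asks only for the cues between two timestamps. Here is a minimal sketch of one way to narrow the result, assuming a cue is selected by its start time and that each empty cue is followed by a blank line. The names `CUE`, `to_ms` and `extract_between` are illustrative and do not come from the original answer.

```python
import re

# Sample text mirroring the question. Assumption: the two empty cues are
# each followed by a blank line.
s = '''00:00:14,099 --> 00:00:19,100
a classic math problem a
00:00:17,039 --> 00:00:28,470
will come from an unexpected place
00:00:18,039 --> 00:00:19,470

00:00:20,039 --> 00:00:21,470

00:00:22,100 --> 00:00:30,119
binary numbers first I'm going to give
00:00:30,119 --> 00:00:35,430
puzzle and then you can try to solve it
00:00:32,489 --> 00:00:37,170
like I said you have a thousand bottles'''

# Groups: start time, end time, and the single line that follows them.
CUE = re.compile(
    r'(\d{2}:\d{2}:\d{2},\d{3}) --> (\d{2}:\d{2}:\d{2},\d{3})\n(.*)'
)


def to_ms(stamp):
    """Convert an 'HH:MM:SS,mmm' timestamp to milliseconds."""
    hours, minutes, rest = stamp.split(':')
    seconds, millis = rest.split(',')
    total_seconds = (int(hours) * 60 + int(minutes)) * 60 + int(seconds)
    return total_seconds * 1000 + int(millis)


def extract_between(text, first_start, last_start):
    """Return the text of cues whose start time is in [first_start, last_start]."""
    lo, hi = to_ms(first_start), to_ms(last_start)
    return [
        body
        for start, _end, body in CUE.findall(text)
        if lo <= to_ms(start) <= hi
    ]


selected = extract_between(s, '00:00:17,039', '00:00:30,119')
print(selected)
```

Comparing the timestamps as integers avoids relying on string ordering. To select cues by their end time instead, the same filter can test the second group.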