regex - How to grep only the match at the desired position in a single line where there are multiple matches, using regex?
Problem description
I have a file with hundreds of links of the form:
https://file1.mp4" target='_blank'>HD-MQ</a> | <a href="https://file1_v2.mkv
And sometimes the line ends with mp4 instead of mkv, like below:
https://file1.mp4" target='_blank'>HD-MQ</a> | <a href="https://file1_v2.mp4
I already tried the pattern 'http.+mp4' (or the same with mkv at the end) to get a single URL, but it keeps printing the whole line, because '.+' is greedy: it matches everything from the first http to the last mp4 on the line.
How could I specify the regex (using grep) to match only one of the URLs, without the HTML garbage in the middle?
The final result needs to be https://file1.mp4 or https://file1_v2.mkv, with me specifying which one I want.
Solution
You can exclude double quotes from the pattern, so that a match cannot run past the end of one URL:
grep -o 'https://[^"]*\.mp4' file
grep -o 'https://[^"]*\.mkv' file
Or, to match both types:
grep -E -o 'https://[^"]*\.(mp4|mkv)' file
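The commands above print every matching URL, one per output line. To pick a URL by its position on each input line, as the question asks, something like the following sketch may work. Note that `nth_url` is a hypothetical helper name introduced here for illustration, not part of grep, and it assumes every URL ends at a double quote or at the end of the line, as in the samples.

```shell
# Hypothetical helper: print the N-th mp4/mkv URL found on each line of a file.
# Usage: nth_url N FILE
nth_url() {
    n=$1
    file=$2
    # Read the file line by line; the "|| [ -n ... ]" part also handles
    # a final line that has no trailing newline.
    while IFS= read -r line || [ -n "$line" ]; do
        # grep -o prints each match on its own line;
        # sed -n "${n}p" keeps only the N-th of them.
        printf '%s\n' "$line" \
            | grep -E -o 'https://[^"]*\.(mp4|mkv)' \
            | sed -n "${n}p"
    done < "$file"
}
```

With the first sample line from the question, `nth_url 1 file` should print https://file1.mp4 and `nth_url 2 file` should print https://file1_v2.mkv. Lines with fewer than N matches produce no output.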