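shell - Excluding domains and file types with wget
Problem description
I have found a lot of information about this, but I cannot get wget to exclude domains and file extensions.
I have a .txt file containing many URLs. I want to avoid downloading images (jpg, png, gif) from certain domains, and also avoid downloading HTML or linked files.
With the following command, everything listed in file.txt gets downloaded: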
wget -i file.txt
The file contains the following URLs:
https://feedly.com/
http://img2.rtve.es/v/3195388?w=1600&preview=1435846554460.jpg
https://images.vexels.com/media/users/3/127855/isolated/preview/c3f01cf799e4c8714a815fac05820bea-reloj-despertador-plana-verde-by-vexels.png
https://upload.wikimedia.org/wikipedia/commons/2/2c/Rotating_earth_%28large%29.gif
To exclude a domain, I tried wget -i file.txt --exclude-domains img2.rtve.es. The command ran without errors, but the file from img2.rtve.es was still downloaded:
wget -i file.txt --exclude-domains img2.rtve.es
--2018-05-18 16:29:54-- https://feedly.com/
Resolving feedly.com... 104.20.60.241, 104.20.59.241
Connecting to feedly.com|104.20.60.241|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: unspecified [text/html]
Saving to: ‘index.html’
index.html [ <=> ] 15.45K --.-KB/s in 0.03s
2018-05-18 16:29:55 (616 KB/s) - ‘index.html’ saved [15821]
--2018-05-18 16:29:55-- http://img2.rtve.es/v/3195388?w=1600&preview=1435846554460.jpg
Resolving img2.rtve.es... 8.252.16.124, 8.253.165.245, 8.253.48.245
Connecting to img2.rtve.es|8.252.16.124|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 87358 (85K) [image/jpeg]
Saving to: ‘3195388?w=1600&preview=1435846554460.jpg’
3195388?w=1600&prev 100%[===================>] 85.31K 552KB/s in 0.2s
2018-05-18 16:29:56 (552 KB/s) - ‘3195388?w=1600&preview=1435846554460.jpg’ saved [87358/87358]
--2018-05-18 16:29:56-- https://images.vexels.com/media/users/3/127855/isolated/preview/c3f01cf799e4c8714a815fac05820bea-reloj-despertador-plana-verde-by-vexels.png
Resolving images.vexels.com... 177.54.152.45
Connecting to images.vexels.com|177.54.152.45|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 9957 (9.7K) [image/png]
Saving to: ‘c3f01cf799e4c8714a815fac05820bea-reloj-despertador-plana-verde-by-vexels.png’
c3f01cf799e4c8714a8 100%[===================>] 9.72K --.-KB/s in 0s
2018-05-18 16:29:56 (69.8 MB/s) - ‘c3f01cf799e4c8714a815fac05820bea-reloj-despertador-plana-verde-by-vexels.png’ saved [9957/9957]
--2018-05-18 16:29:56-- https://upload.wikimedia.org/wikipedia/commons/2/2c/Rotating_earth_%28large%29.gif
Resolving upload.wikimedia.org... 208.80.154.240
Connecting to upload.wikimedia.org|208.80.154.240|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 1429302 (1.4M) [image/gif]
Saving to: ‘Rotating_earth_(large).gif’
Rotating_earth_(lar 100%[===================>] 1.36M 1.00MB/s in 1.4s
2018-05-18 16:29:58 (1.00 MB/s) - ‘Rotating_earth_(large).gif’ saved [1429302/1429302]
FINISHED --2018-05-18 16:29:58--
Total wall clock time: 4.1s
Downloaded: 4 files, 1.5M in 1.5s (978 KB/s)
To exclude an extension, I tried wget -i file.txt --reject gif. Again there were no errors, but the .gif file was still downloaded:
MacBook-Pro:test tomillo$ wget -i file.txt --reject gif
--2018-05-18 16:34:28-- https://feedly.com/
Resolving feedly.com... 104.20.59.241, 104.20.60.241
Connecting to feedly.com|104.20.59.241|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: unspecified [text/html]
Saving to: ‘index.html’
index.html [ <=> ] 15.45K --.-KB/s in 0.04s
2018-05-18 16:34:30 (429 KB/s) - ‘index.html’ saved [15821]
--2018-05-18 16:34:30-- http://img2.rtve.es/v/3195388?w=1600&preview=1435846554460.jpg
Resolving img2.rtve.es... 8.252.16.124, 8.253.165.245, 8.253.149.117
Connecting to img2.rtve.es|8.252.16.124|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 87358 (85K) [image/jpeg]
Saving to: ‘3195388?w=1600&preview=1435846554460.jpg’
3195388?w=1600&prev 100%[===================>] 85.31K 566KB/s in 0.2s
2018-05-18 16:34:30 (566 KB/s) - ‘3195388?w=1600&preview=1435846554460.jpg’ saved [87358/87358]
--2018-05-18 16:34:30-- https://images.vexels.com/media/users/3/127855/isolated/preview/c3f01cf799e4c8714a815fac05820bea-reloj-despertador-plana-verde-by-vexels.png
Resolving images.vexels.com... 177.54.152.175
Connecting to images.vexels.com|177.54.152.175|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 9957 (9.7K) [image/png]
Saving to: ‘c3f01cf799e4c8714a815fac05820bea-reloj-despertador-plana-verde-by-vexels.png’
c3f01cf799e4c8714a8 100%[===================>] 9.72K --.-KB/s in 0s
2018-05-18 16:34:30 (74.2 MB/s) - ‘c3f01cf799e4c8714a815fac05820bea-reloj-despertador-plana-verde-by-vexels.png’ saved [9957/9957]
--2018-05-18 16:34:30-- https://upload.wikimedia.org/wikipedia/commons/2/2c/Rotating_earth_%28large%29.gif
Resolving upload.wikimedia.org... 208.80.154.240
Connecting to upload.wikimedia.org|208.80.154.240|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 1429302 (1.4M) [image/gif]
Saving to: ‘Rotating_earth_(large).gif’
Rotating_earth_(lar 100%[===================>] 1.36M 1024KB/s in 1.4s
2018-05-18 16:34:32 (1024 KB/s) - ‘Rotating_earth_(large).gif’ saved [1429302/1429302]
FINISHED --2018-05-18 16:34:32--
Total wall clock time: 3.9s
Downloaded: 4 files, 1.5M in 1.6s (972 KB/s)
What am I doing wrong?
Solution
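The wget manual lists --exclude-domains and -R/--reject among its recursive accept/reject options. A plausible reading, consistent with the logs above, is that they only filter links wget discovers while following pages (for example with -r), not the URLs handed to it directly through -i. Under that assumption, one workaround is to filter the list before wget sees it. The sketch below is not a wget feature: filter_urls, EXCLUDE_HOSTS and EXCLUDE_EXTS are hypothetical names chosen for illustration, and the patterns only cover simple http/https URLs.

```shell
#!/bin/sh
# Hypothetical helper: read URLs on stdin, drop unwanted ones, print the rest.
# EXCLUDE_HOSTS and EXCLUDE_EXTS are illustrative variables, not wget options.
# Both are extended-regex alternations, e.g. 'a\.example|b\.example' or 'gif|png'.
EXCLUDE_HOSTS='img2\.rtve\.es'
EXCLUDE_EXTS='gif'

filter_urls() {
    # 1st grep: drop URLs whose host is in EXCLUDE_HOSTS (optional port allowed).
    # 2nd grep: drop URLs ending in an excluded extension, optionally followed
    #           by a query string; matching is case-insensitive.
    grep -Ev "^https?://(${EXCLUDE_HOSTS})(:[0-9]+)?(/|\$)" |
    grep -Eiv "\.(${EXCLUDE_EXTS})(\?.*)?\$"
}

# Usage (performs network requests, so it is left commented out):
#   filter_urls < file.txt | wget -i -
```

wget reads the URL list from standard input when the file name given to -i is -, so the filtered list can be piped straight in. To skip more hosts or extensions, extend the alternations, for example EXCLUDE_EXTS='gif|png|jpg'.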