首页 > 解决方案 > bash - 按时间过滤文件中的 IP

问题描述

我需要完成程序“wana”以按时间过滤此 IP 日志(-a > after ,-b > before >> time from-to)以仅以指定时间日期时间格式显示行:YYYY-MM-DD HH:MM :SS 到参数 -a 和 -b

这是我的日志文件,我使用:https ://pajda.fit.vutbr.cz/ios/ios-19-1-logs/blob/master/ios-example.com.access.log > 测试日志:

2001:67c:1220:80c:d4:985a:df2c:d717 - - [22/Feb/2019:07:49:01 +0100] "GET / HTTP/1.1" 200 58266 "-" "curl/7.61.1"
2001:67c:1220:80c:d4:985a:df2c:d717 - - [22/Feb/2019:08:49:01 +0100] "GET / HTTP/1.1" 200 58341 "-" "curl/7.61.1"
2001:67c:1220:808::93e5:8ad - - [22/Feb/2019:08:56:10 +0100] "POST /wp-cron.php?doing_wp_cron=1550822170.2184400558471679687500 HTTP/1.1" 200 3279 "https://ios-example.com/wp-cron.php?doing_wp_cron=1550822170.2184400558471679687500" "WordPress/4.9.9; https://ios-example.com"
40.77.167.115 - - [22/Feb/2019:08:56:10 +0100] "GET / HTTP/1.1" 301 3541 "-" "Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)"
147.229.13.201 - - [22/Feb/2019:09:24:33 +0100] "-" 408 3275 "-" "-"
147.229.13.201 - - [22/Feb/2019:09:24:33 +0100] "-" 408 3275 "-" "-"
198.27.69.191 - - [22/Feb/2019:09:43:13 +0100] "GET / HTTP/1.1" 200 22311 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:64.0) Gecko/20100101 Firefox/64.0"
198.27.69.191 - - [22/Feb/2019:09:43:24 +0100] "GET / HTTP/1.1" 200 22313 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:64.0) Gecko/20100101 Firefox/64.0"
198.27.69.191 - - [22/Feb/2019:09:43:42 +0100] "GET /?gf_page=upload HTTP/1.1" 200 22304 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:64.0) Gecko/20100101 Firefox/64.0"
198.27.69.191 - - [22/Feb/2019:09:44:07 +0100] "GET / HTTP/1.1" 200 22313 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:64.0) Gecko/20100101 Firefox/64.0"
198.27.69.191 - - [22/Feb/2019:09:44:37 +0100] "GET /?up_auto_log=true HTTP/1.1" 200 22315 "-" "Mozilla/5.0 (Windows NT 6.1; rv:36.0) Gecko/20100101 Firefox/36.0"
198.27.69.191 - - [22/Feb/2019:09:44:54 +0100] "GET /wp-admin/ HTTP/1.1" 302 3711 "-" "Mozilla/5.0 (Windows NT 6.1; rv:36.0) Gecko/20100101 Firefox/36.0"
198.27.69.191 - - [22/Feb/2019:09:44:55 +0100] "GET /wp-login.php?redirect_to=https%3A%2F%2Fios-example.com%2Fwp-admin%2F&reauth=1 HTTP/1.1" 200 3656 "-" "Mozilla/5.0 (Windows NT 6.1; rv:36.0) Gecko/20100101 Firefox/36.0"
198.27.69.191 - - [22/Feb/2019:09:45:38 +0100] "GET / HTTP/1.1" 200 22311 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:64.0) Gecko/20100101 Firefox/64.0"
2001:67c:1220:80c:d4:985a:df2c:d717 - - [22/Feb/2019:09:49:01 +0100] "GET / HTTP/1.1" 200 58276 "-" "curl/7.61.1"
2001:67c:1220:808::93e5:8ad - - [22/Feb/2019:10:49:01 +0100] "POST /wp-cron.php?doing_wp_cron=1550828941.3725960254669189453125 HTTP/1.1" 200 3279 "https://ios-example.com/wp-cron.php?doing_wp_cron=1550828941.3725960254669189453125" "WordPress/4.9.9; https://ios-example.com"
2001:67c:1220:80c:d4:985a:df2c:d717 - - [22/Feb/2019:10:49:01 +0100] "GET / HTTP/1.1" 200 58241 "-" "curl/7.61.1"
66.249.66.49 - - [22/Feb/2019:10:49:08 +0100] "GET /robots.txt HTTP/1.1" 404 3798 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
66.249.66.45 - - [22/Feb/2019:10:49:08 +0100] "GET / HTTP/1.1" 200 22306 "-" "Mozilla/5.0 (Linux; Android 6.0.1; Nexus 5X Build/MMB29P) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/41.0.2272.96 Mobile Safari/537.36 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
82.202.69.253 - - [22/Feb/2019:11:26:58 +0100] "GET / HTTP/1.1" 200 22226 "-" "Mozilla/5.0 (Windows NT 5.1; rv:9.0.1) Gecko/20100101 Firefox/9.0.1"
82.202.69.253 - - [22/Feb/2019:11:27:44 +0100] "GET /HNAP1/ HTTP/1.1" 404 3723 "http://ios-example.com/" "Mozilla/5.0 (Windows NT 5.1; rv:9.0.1) Gecko/20100101 Firefox/9.0.1"

程序wana(需要完成):

#!/bin/bash

cat $5 | # filter rows by time from $2 to $4

这就是我调用程序的方式

$ ./wana -a "2019-02-22 09:00:00" -b "2019-02-22 09:44:54" ios-example.com.access.log

我需要这个选定的输出到控制台:

147.229.13.201 - - [22/Feb/2019:09:24:33 +0100] "-" 408 3275 "-" "-"
147.229.13.201 - - [22/Feb/2019:09:24:33 +0100] "-" 408 3275 "-" "-"
198.27.69.191 - - [22/Feb/2019:09:43:13 +0100] "GET / HTTP/1.1" 200 22311 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:64.0) Gecko/20100101 Firefox/64.0"
198.27.69.191 - - [22/Feb/2019:09:43:24 +0100] "GET / HTTP/1.1" 200 22313 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:64.0) Gecko/20100101 Firefox/64.0"
198.27.69.191 - - [22/Feb/2019:09:43:42 +0100] "GET /?gf_page=upload HTTP/1.1" 200 22304 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:64.0) Gecko/20100101 Firefox/64.0"
198.27.69.191 - - [22/Feb/2019:09:44:07 +0100] "GET / HTTP/1.1" 200 22313 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:64.0) Gecko/20100101 Firefox/64.0"
198.27.69.191 - - [22/Feb/2019:09:44:37 +0100] "GET /?up_auto_log=true HTTP/1.1" 200 22315 

标签: bashawktext-manipulation

解决方案


$ cat tst.sh
#!/bin/env bash

beg="$2"
end="$4"
file="$5"

awk -v beg="$beg" -v end="$end" '
    {
        split($4,t,/[[\/:]/)
        mthNr = (index("JanFebMarAprMayJunJulAugSepOctNovDec",t[3])+2)/3
        cur = sprintf("%04d-%02d-%02d %02d:%02d:%02d",t[4],mthNr,t[2],t[5],t[6],t[7])
    }
    (cur > beg) && (cur < end)
' "$file"

$ ./tst.sh -a '2019-02-22 09:00:00' -b '2019-02-22 09:44:54' file
147.229.13.201 - - [22/Feb/2019:09:24:33 +0100] "-" 408 3275 "-" "-"
147.229.13.201 - - [22/Feb/2019:09:24:33 +0100] "-" 408 3275 "-" "-"
198.27.69.191 - - [22/Feb/2019:09:43:13 +0100] "GET / HTTP/1.1" 200 22311 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:64.0) Gecko/20100101 Firefox/64.0"
198.27.69.191 - - [22/Feb/2019:09:43:24 +0100] "GET / HTTP/1.1" 200 22313 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:64.0) Gecko/20100101 Firefox/64.0"
198.27.69.191 - - [22/Feb/2019:09:43:42 +0100] "GET /?gf_page=upload HTTP/1.1" 200 22304 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:64.0) Gecko/20100101 Firefox/64.0"
198.27.69.191 - - [22/Feb/2019:09:44:07 +0100] "GET / HTTP/1.1" 200 22313 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:64.0) Gecko/20100101 Firefox/64.0"
198.27.69.191 - - [22/Feb/2019:09:44:37 +0100] "GET /?up_auto_log=true HTTP/1.1" 200 22315 "-" "Mozilla/5.0 (Windows NT 6.1; rv:36.0) Gecko/20100101 Firefox/36.0"

我希望你可以添加 getopts 循环或任何你喜欢的东西来真正从参数中填充变量。


推荐阅读