docker wordpress + nginx returns an empty response to curl without headers

Problem description

I have WordPress + nginx running in a Docker container. It works perfectly in the browser, but when I send an HTTP request with curl, the response is always empty:

❯ curl -vv localhost:8080
*   Trying 127.0.0.1...
* TCP_NODELAY set
* Connected to localhost (127.0.0.1) port 8080 (#0)
> GET / HTTP/1.1
> Host: localhost:8080
> User-Agent: curl/7.64.1
> Accept: */*
>
* Empty reply from server
* Connection #0 to host localhost left intact
curl: (52) Empty reply from server
* Closing connection 0

If I add any User-Agent header with the -H option, it works, but I want it to work even when the request carries no User-Agent header.

Here is my nginx configuration:

worker_processes 1;
daemon off;

events {
    worker_connections 1024;
}

http {
    root /var/www/html;

    log_format  main  '$remote_addr - $remote_user [$time_local] "$request" '
                      '$status $body_bytes_sent "$http_referer" '
                      '"$http_user_agent" "$http_x_forwarded_for"';

    access_log /dev/stdout main;
    error_log /dev/stderr error;

    sendfile on;
    tcp_nopush on;
    tcp_nodelay on;
    #keepalive low (5seconds), should force hackers to re-connect.
    keepalive_timeout  5;
    fastcgi_intercept_errors on;
    fastcgi_buffers 16 16k; 
    fastcgi_buffer_size 32k;
    default_type       application/octet-stream;

    #php max upload limit cannot be larger than this
    client_max_body_size 40m;

    gzip on;
    gzip_disable "msie6";
    gzip_min_length 256;
    gzip_comp_level 4;
    gzip_types text/plain text/css application/json application/x-javascript text/xml application/xml application/xml+rss text/javascript application/javascript image/svg+xml;

    limit_req_zone $remote_addr zone=loginauth:10m rate=15r/s;

    include  /etc/nginx/mime.types;
    include  /etc/nginx/nginx-server.conf;
}
server {
    listen 8080 default_server;
    server_name "localhost";

    access_log /dev/stdout main;
    error_log /dev/stdout error;

    # pass the PHP scripts to FastCGI
    location ~ \.php$ {
        include        fastcgi_params;
        fastcgi_pass   unix:/home/www-data/php-fpm.sock;
        fastcgi_index  index.php;
        fastcgi_param  DOCUMENT_ROOT $realpath_root;
        fastcgi_param  SCRIPT_FILENAME $document_root$fastcgi_script_name;
        fastcgi_intercept_errors on;
    }

    #Deny access to .htaccess, .htpasswd...
    location ~ /\.ht {
        deny  all;
    }


    location ~* .(jpg|jpeg|png|gif|ico|css|js|pdf|doc|docx|odt|rtf|ppt|pptx|xls|xlsx|txt)$ {
      expires max;
    }

    location = /favicon.ico {
        log_not_found off;
        access_log off;
    }

    location = /robots.txt {
        allow all;
        log_not_found off;
        access_log off;
    }

    #Block bad-bots
    if ($http_user_agent ~* (360Spider|80legs.com|Abonti|AcoonBot|Acunetix|adbeat_bot|AddThis.com|adidxbot|ADmantX|AhrefsBot|AngloINFO|Antelope|Applebot|BaiduSpider|BeetleBot|billigerbot|binlar|bitlybot|BlackWidow|BLP_bbot|BoardReader|Bolt\ 0|BOT\ for\ JCE|Bot\ mailto\:craftbot@yahoo\.com|casper|CazoodleBot|CCBot|checkprivacy|ChinaClaw|chromeframe|Clerkbot|Cliqzbot|clshttp|CommonCrawler|comodo|CPython|crawler4j|Crawlera|CRAZYWEBCRAWLER|Curious|Curl|Custo|CWS_proxy|Default\ Browser\ 0|diavol|DigExt|Digincore|DIIbot|discobot|DISCo|DoCoMo|DotBot|Download\ Demon|DTS.Agent|EasouSpider|eCatch|ecxi|EirGrabber|Elmer|EmailCollector|EmailSiphon|EmailWolf|Exabot|ExaleadCloudView|ExpertSearchSpider|ExpertSearch|Express\ WebPictures|ExtractorPro|extract|EyeNetIE|Ezooms|F2S|FastSeek|feedfinder|FeedlyBot|FHscan|finbot|Flamingo_SearchEngine|FlappyBot|FlashGet|flicky|Flipboard|g00g1e|Genieo|genieo|GetRight|GetWeb\!|GigablastOpenSource|GozaikBot|Go\!Zilla|Go\-Ahead\-Got\-It|GrabNet|grab|Grafula|GrapeshotCrawler|GTB5|GT\:\:WWW|Guzzle|harvest|heritrix|HMView|HomePageBot|HTTP\:\:Lite|HTTrack|HubSpot|ia_archiver|icarus6|IDBot|id\-search|IlseBot|Image\ Stripper|Image\ Sucker|Indigonet|Indy\ Library|integromedb|InterGET|InternetSeer\.com|Internet\ Ninja|IRLbot|ISC\ Systems\ iRc\ Search\ 2\.1|jakarta|Java|JetCar|JobdiggerSpider|JOC\ Web\ Spider|Jooblebot|kanagawa|KINGSpider|kmccrew|larbin|LeechFTP|libwww|Lingewoud|LinkChecker|linkdexbot|LinksCrawler|LinksManager\.com_bot|linkwalker|LinqiaRSSBot|LivelapBot|ltx71|LubbersBot|lwp\-trivial|Mail.RU_Bot|masscan|Mass\ Downloader|maverick|Maxthon$|Mediatoolkitbot|MegaIndex|MegaIndex|megaindex|MFC_Tear_Sample|Microsoft\ URL\ Control|microsoft\.url|MIDown\ tool|miner|Missigua\ Locator|Mister\ PiX|mj12bot|Mozilla.*Indy|Mozilla.*NEWT|MSFrontPage|msnbot|Navroad|NearSite|NetAnts|netEstate|NetSpider|NetZIP|Net\ Vampire|NextGenSearchBot|nutch|Octopus|Offline\ Explorer|Offline\ 
Navigator|OpenindexSpider|OpenWebSpider|OrangeBot|Owlin|PageGrabber|PagesInventory|panopta|panscient\.com|Papa\ Foto|pavuk|pcBrowser|PECL\:\:HTTP|PeoplePal|Photon|PHPCrawl|planetwork|PleaseCrawl|PNAMAIN.EXE|PodcastPartyBot|prijsbest|proximic|psbot|purebot|pycurl|QuerySeekerSpider|R6_CommentReader|R6_FeedFetcher|RealDownload|ReGet|Riddler|Rippers\ 0|rogerbot|RSSingBot|rv\:1.9.1|RyzeCrawler|SafeSearch|SBIder|Scrapy|Scrapy|Screaming|SeaMonkey$|search.goo.ne.jp|SearchmetricsBot|search_robot|SemrushBot|Semrush|SentiBot|SEOkicks|SeznamBot|ShowyouBot|SightupBot|SISTRIX|sitecheck\.internetseer\.com|siteexplorer.info|SiteSnagger|skygrid|Slackbot|Slurp|SmartDownload|Snoopy|Sogou|Sosospider|spaumbot|Steeler|sucker|SuperBot|Superfeedr|SuperHTTP|SurdotlyBot|Surfbot|tAkeOut|Teleport\ Pro|TinEye-bot|TinEye|Toata\ dragostea\ mea\ pentru\ diavola|Toplistbot|trendictionbot|TurnitinBot|turnit|Twitterbot|URI\:\:Fetch|urllib|Vagabondo|Vagabondo|vikspider|VoidEYE|VoilaBot|WBSearchBot|webalta|WebAuto|WebBandit|WebCollage|WebCopier|WebFetch|WebGo\ IS|WebLeacher|WebReaper|WebSauger|Website\ eXtractor|Website\ Quester|WebStripper|WebWhacker|WebZIP|Web\ Image\ Collector|Web\ Sucker|Wells\ Search\ II|WEP\ Search|WeSEE|Wget|Widow|WinInet|woobot|woopingbot|worldwebheritage.org|Wotbox|WPScan|WWWOFFLE|WWW\-Mechanize|Xaldon\ WebSpider|XoviBot|yacybot|Yahoo|YandexBot|Yandex|YisouSpider|zermelo|Zeus|zh-CN|ZmEu|ZumBot|ZyBorg) ) {
              return 444;
    }

    include /etc/nginx/nginx-locations.conf;
    include /var/www/nginx/locations/*;
}

# Deny all attempts to access hidden files such as .htaccess, .htpasswd, .DS_Store (Mac).
# Keep logging the requests to parse later (or to pass to firewall utilities such as fail2ban)
location ~ /\. {
    deny all;
}

# Deny access to any files with a .php extension in the uploads directory for the single site
location ~ ^/wp-content/uploads/.*\.php$ {
    deny all;
}

#Deny access to wp-content folders for suspicious files
location ~* ^/(wp-content)/(.*?)\.(zip|gz|tar|bzip2|7z)\$ { deny all; }
location ~ ^/wp-content/uploads/sucuri { deny all; }
location ~ ^/wp-content/updraft { deny all; }
location ~* ^/wp-content/uploads/.*.(html|htm|shtml|php|js|swf)$ {
        deny all;
}

# Block PHP files in includes directory.
location ~* /wp-includes/.*\.php\$ {
  deny all;
}

# Deny access to any files with a .php extension in the uploads directory
# Works in sub-directory installs and also in multisite network
# Keep logging the requests to parse later (or to pass to firewall utilities such as fail2ban)
location ~* /(?:uploads|files|wp-content|wp-includes)/.*\.php$ {
    deny all;
}

# Block nginx-help log from public viewing
location ~* /wp-content/uploads/nginx-helper/ { deny all; }

# Deny access to any files with a .php extension in the uploads directory
# Works in sub-directory installs and also in multisite network
location ~* /(?:uploads|files)/.*\.php\$ { deny all; }

# Deny access to uploads that aren’t images, videos, music, etc.
location ~* ^/wp-content/uploads/.*.(html|htm|shtml|php|js|swf|css)$ {
    deny all;
}



location / {
        # This is cool because no php is touched for static content.
        # include the "?$args" part so non-default permalinks doesn't break when using query string
                index  index.php index.html;
        try_files $uri $uri/ /index.php?$args;
}

# More ideas from:
# https://gist.github.com/ethanpil/1bfd01a817a8198369efec5c4cde6628

location ~* /(\.|wp-config\.php|wp-config\.txt|changelog\.txt|readme\.txt|readme\.html|license\.txt) { deny all; }

# Make sure files with the following extensions do not get loaded by nginx because nginx would display the source code, and these files can contain PASSWORDS!
location ~* \.(engine|inc|info|install|make|module|profile|test|po|sh|.*sql|theme|tpl(\.php)?|xtmpl)\$|^(\..*|Entries.*|Repository|Root|Tag|Template)\$|\.php_
{
    return 444;
}
#nocgi
location ~* \.(pl|cgi|py|sh|lua)\$ {
    return 444;
}
#disallow
location ~* (w00tw00t) {
    return 444;
}

My goal is for the server to respond to any request, even one without a User-Agent header.

Thanks for your time!

Tags: wordpress, docker, nginx, curl

Solution


This has nothing to do with Docker, WordPress, or anything else.
It is your nginx configuration alone that rejects the request:

nginx-server.conf lists Curl in its user-agent comparison:

    #Block bad-bots
    if ($http_user_agent ~* (...|Curl|...) ) {
              return 444;
    }

Because ~* is the case-insensitive matching operator, every request from curl, which sends User-Agent: curl/7.64.1 by default, returns 444 here.
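
One way to reach the goal stated in the question is to drop the Curl alternative from the blocklist. The following is a minimal sketch, not the original author's fix: it works on an abbreviated copy of the pattern, the variable names are illustrative, and grep -E only approximates the PCRE matching that nginx performs.

```shell
#!/bin/sh
# Abbreviated copy of the alternation from the "Block bad-bots" rule;
# the real list is much longer.
pattern='(CPython|crawler4j|Curious|Curl|Custo|pycurl)'

# Remove only the standalone "Curl" alternative; "pycurl" stays in the list.
fixed=$(printf '%s\n' "$pattern" | sed -E 's/\|Curl\|/|/')
printf '%s\n' "$fixed"

# grep -i mirrors the case-insensitive ~* operator.
if printf '%s\n' 'curl/7.64.1' | grep -qiE "$fixed"; then
    echo 'curl is still blocked'
else
    echo 'curl is no longer blocked'
fi
```

The same substitution could presumably be applied to the real pattern in nginx-server.conf, after which nginx has to reload its configuration (or the container has to be restarted) before the change takes effect.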

An example of how to check this with grep:

$ echo 'curl/7.64.1' | grep -iPo '(...some...|Curl|...other...)'
curl
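
The same check can be run over several user agents at once, which also suggests why overriding the header with -H lets the request through. This is a rough sketch under stated assumptions: the pattern is abbreviated, the user-agent strings other than curl/7.64.1 are made up, and is_blocked is a hypothetical helper rather than anything nginx provides.

```shell
#!/bin/sh
# Abbreviated copy of the blocklist alternation (assumption: the real
# list behaves the same way for these sample user agents).
pattern='(CPython|crawler4j|Curious|Curl|Custo|pycurl)'

# Hypothetical helper: succeeds when the user agent matches the blocklist.
# grep -i stands in for nginx's case-insensitive ~* operator.
is_blocked() {
    printf '%s\n' "$1" | grep -qiE "$pattern"
}

for ua in 'curl/7.64.1' 'my-client/1.0' 'Mozilla/5.0'; do
    if is_blocked "$ua"; then
        echo "$ua: blocked (444)"
    else
        echo "$ua: allowed"
    fi
done
```

Only the default curl user agent matches here, so replacing it with a string that contains none of the listed names would avoid the 444.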

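Code 444 is a special, non-standard nginx code: returning it makes nginx close the connection immediately without sending anything to the client. To the client this looks similar to a refused connection (closed by peer).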

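FWIW (for tracking down why a request is not handled as expected): nginx can be set up with debug logging (for example, on a specific port listener in order to debug it), so that the error log contains the details of how the request is processed: which locations and rewrite rules are triggered, what happens to the request at each phase, and which response and error code are finally sent to the client.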
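
As a rough illustration of that debugging approach, the sketch below writes a hypothetical extra server block that logs at debug level on its own port. The file name debug-server.conf and port 8081 are assumptions made for illustration, and the debug level only produces detailed output when the nginx binary was built with --with-debug.

```shell
#!/bin/sh
# Write a hypothetical debug listener to a separate file (name is an assumption).
conf=debug-server.conf
cat > "$conf" <<'EOF'
server {
    listen 8081;
    # Debug-level logging for this listener only.
    error_log /dev/stderr debug;
    root /var/www/html;
    # Copy the rules under investigation (for example the
    # "Block bad-bots" if block) into this server block.
}
EOF

# Confirm the directive made it into the file.
grep -n 'error_log' "$conf"
```

Whether a given build supports debug logging can be checked with nginx -V 2>&1 | grep -o with-debug, since nginx -V prints its configure arguments.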

