Rails application keeps crashing under heavy load

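Problem description

I am load-testing my application because I expect heavy load next week. I am currently using BlazeMeter to simulate the load. My server is hosted on AWS Elastic Beanstalk with RDS, on an m5.large EC2 instance. I have set up my application with Puma, Capistrano, and Nginx. My configuration is as follows: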

//nginx.conf

upstream app {
  server unix:///home/deploy/apps/appname/shared/tmp/sockets/appname-puma.sock;
}

server {
  listen 80;
  server_name example.com;

  root /home/deploy/apps/appname/current/public;

  if ($http_x_forwarded_proto != "https") {
    return 301 https://$server_name$request_uri;
  }
}

server {
  listen 443 default_server http2;

  root /home/deploy/apps/appname/current/public;

  try_files /system/maintenance.html $uri/index.html $uri $uri.html @app;

  access_log /var/log/nginx/access.log main;
  access_log /var/log/nginx/healthd/application.log.$year-$month-$day-$hour healthd;
  error_log /var/log/nginx/error.log debug;

  location /assets {
    alias /var/app/current/public/assets;
    gzip_static on;
    gzip on;
    expires max;
    add_header Cache-Control public;
  }

  location ~* \.(jpg|jpeg|gif|css|png|js|ico|svg|svgz)$ {
    gzip_static on;
    gzip on;
    expires max;
    add_header Cache-Control public;
  }

  location @app {
    proxy_pass http://app; # match the name of upstream directive which is defined above
    proxy_set_header Host $http_host;
    proxy_set_header X-Real-IP $remote_addr;
    proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    proxy_set_header X-Forwarded-Proto $scheme;
    proxy_set_header HTTP_CLIENT_IP $remote_addr;
  }

  error_page 500 502 503 504 /500.html;
  location /500.html {}
  error_page 404 /404.html;
  location /404.html {}
  error_page 422 /422.html;
  location /422.html {}
}

##config/deploy.rb

# config valid only for current version of Capistrano
lock "3.8.2"

set :application, "appname"
set :repo_url, #"censored"
set :user,            'deploy'
set :puma_threads,    [16, 64]
set :puma_workers,    2

set :pty, true
set :use_sudo,        false
set :stage,           :production
set :deploy_via,      :remote_cache
set :deploy_to,       "/home/#{fetch(:user)}/apps/#{fetch(:application)}"#{}"/var/app/current"
set :puma_bind,       "unix://#{shared_path}/tmp/sockets/#{fetch(:application)}-puma.sock"
set :puma_state,      "#{shared_path}/tmp/pids/puma.state"
set :puma_pid,        "#{shared_path}/tmp/pids/puma.pid"
set :puma_access_log, "#{release_path}/log/puma.error.log"
set :puma_error_log,  "#{release_path}/log/puma.access.log"
set :ssh_options,     { forward_agent: true, user: fetch(:user), keys: %w(~/.ssh/id_rsa.pub) }
set :puma_preload_app, true
set :puma_worker_timeout, nil
set :puma_init_active_record, true  # Change to false when not using ActiveRecord

append :linked_files, %w{config/database.yml}
append :linked_dirs, ".bundle", "tmp", "public/system"#,  %w{bin log tmp/pids tmp/cache tmp/store vendor/bundle public/system}

namespace :puma do
  desc 'Create Directories for Puma Pids and Socket'
  task :make_dirs do
    on roles(:app) do
      execute "mkdir #{shared_path}/tmp/sockets -p"
      execute "mkdir #{shared_path}/tmp/pids -p"
    end
  end

  before :start, :make_dirs
end

namespace :deploy do
  desc "Make sure local git is in sync with remote."
  task :check_revision do
    on roles(:app) do
      unless `git rev-parse HEAD` == `git rev-parse origin/master`
        puts "WARNING: HEAD is not the same as origin/master"
        puts "Run `git push` to sync changes."
        exit
      end
    end
  end

  desc 'Initial Deploy'
  task :initial do
    on roles(:app) do
      before 'deploy:restart', 'puma:start'
      invoke 'deploy'
    end
  end

  desc 'Restart application'
  task :restart do
    on roles(:app), in: :sequence, wait: 5 do
      invoke 'puma:restart'
    end
  end

  before :starting,     :check_revision
  after  :finishing,    :compile_assets
end

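Many of the tests failed and the error rate is high. According to my logs from Elastic Beanstalk:

[error] 2465#0: *2262 testing "/var/app/current/public/assets" existence failed (2: No such file or directory)

[crit] 2465#0: *2216 connect() to unix:///var/run/puma/my_app.sock failed (2: No such file or directory) while connecting to upstream

I am new to this and have no idea why this is happening! Any help is appreciated! Thanks!

Update 1: My site also shows 502 Bad Gateway (nginx) once a certain number of simulated users is reached.

Update 2: As Myst pointed out, I am also using a database.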

default: &default
    adapter: sqlite3
    pool: 5
    timeout: 5000

development:
    <<: *default
    database: db/development.sqlite3

test:
    <<: *default
    adapter: mysql2
    encoding: utf8
    database: <%= ENV['RDS_DB_NAME'] %>
    username: <%= ENV['RDS_USERNAME'] %>  
    password: <%= ENV['RDS_PASSWORD'] %>
    host: <%= ENV['RDS_HOSTNAME'] %> 
    port: <%= ENV['RDS_PORT'] %>

production:
    <<: *default
    adapter: mysql2
    encoding: utf8
    database: <%= ENV['RDS_DB_NAME'] %>  
    username: <%= ENV['RDS_USERNAME'] %>
    password: <%= ENV['RDS_PASSWORD'] %>
    host: <%= ENV['RDS_HOSTNAME'] %> 
    port: <%= ENV['RDS_PORT'] %>
    pool: 50 #20
    timeout: 10000

Versions: capistrano3-puma (3.1.0), ruby (2.3.0), rails (4.2.8), mysql2 (0.4.9)

Tags: ruby-on-rails, nginx, amazon-ec2, capistrano, puma

Solution


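In general, when dealing with high load it is good to measure machine metrics first: how much load the machine can sustain in a stable state, in terms of RPS or concurrent clients. During that test, watch load average (LA), memory usage, and IO to determine which resource is the current bottleneck.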
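One way to watch load average and memory during a test run is to sample them on the instance itself. The sketch below is a minimal, Linux-only example that assumes `/proc/loadavg` and `/proc/meminfo` are readable; the method names (`parse_loadavg`, `parse_meminfo`, `snapshot`) are made up for illustration, and tools such as `top` or `vmstat` report the same numbers.

```ruby
# Sketch: read load average and available memory on a Linux host.
# Assumption: /proc/loadavg and /proc/meminfo exist (Linux only).

# Parse the contents of /proc/loadavg, e.g. "0.52 0.41 0.30 2/345 6789".
def parse_loadavg(text)
  one, five, fifteen = text.split.first(3).map(&:to_f)
  { load1: one, load5: five, load15: fifteen }
end

# Parse the contents of /proc/meminfo into a hash of integer kB values.
def parse_meminfo(text)
  text.each_line.each_with_object({}) do |line, acc|
    key, value = line.split(":", 2)
    next if value.nil?
    acc[key.strip] = value.to_i # "   4096000 kB" becomes 4096000
  end
end

# Combine both readings into one hash for logging.
def snapshot
  la  = parse_loadavg(File.read("/proc/loadavg"))
  mem = parse_meminfo(File.read("/proc/meminfo"))
  la.merge(mem_available_mb: mem.fetch("MemAvailable", 0) / 1024)
end

# Print one sample only when the /proc files are actually present.
if File.readable?("/proc/loadavg") && File.readable?("/proc/meminfo")
  puts snapshot.inspect
end
```

Running this in a loop (for example once per second) alongside the BlazeMeter test shows whether load average climbs well past the core count, or available memory drops toward zero, before the 502 errors start.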

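Set an upstream connection limit in nginx, for example server unix:///... max_conns=20;. By default the number of connections is unlimited, and under a stress test this can make the workers bloat and run into out-of-memory errors. Once the workers die, nginx cannot connect to the socket and reports a 502 error.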

Check your puma_error_log for exceptions (it looks like it is swapped with the access log in your config).
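In the deploy.rb from the question, `:puma_access_log` points at `puma.error.log` and `:puma_error_log` points at `puma.access.log`, so exceptions currently end up in the file named `puma.access.log`. A minimal sketch of the corrected settings, keeping the question's `release_path` location and changing only the file names:

```ruby
# config/deploy.rb (fragment; evaluated by Capistrano, not runnable on its own)
set :puma_access_log, "#{release_path}/log/puma.access.log"
set :puma_error_log,  "#{release_path}/log/puma.error.log"
```

Until a change like this is deployed, the file to inspect for exceptions is the one currently named `puma.access.log`.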

Also, since you only have 2 CPU cores, I doubt you need 64 threads times 2 workers, unless most of your requests spend their time waiting on calls to external APIs.
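The arithmetic behind that doubt can be sketched with the numbers from the question. The method name `sizing_report` and its keys are hypothetical. The pool check relies on the fact that each Puma worker is a separate process with its own ActiveRecord connection pool, so a worker running more threads than `pool` can leave threads waiting for a database connection.

```ruby
# Sketch: worst-case thread and DB connection counts for a Puma setup.
# The inputs below are copied from the question's deploy.rb and database.yml.

def sizing_report(workers:, max_threads:, db_pool:, cpu_cores:)
  total_threads = workers * max_threads
  {
    # Threads that may run at once across all workers.
    total_threads: total_threads,
    # How many threads compete for each core in the worst case.
    threads_per_core: total_threads / cpu_cores.to_f,
    # Threads per worker that cannot hold a DB connection at the same time.
    pool_shortfall_per_worker: [max_threads - db_pool, 0].max,
    # Upper bound on connections the app can open against the database.
    max_db_connections: workers * [max_threads, db_pool].min
  }
end

report = sizing_report(
  workers: 2,      # set :puma_workers, 2
  max_threads: 64, # set :puma_threads, [16, 64]
  db_pool: 50,     # production pool in database.yml
  cpu_cores: 2     # 2 cores, as noted in the answer
)

report.each { |name, value| puts "#{name}: #{value}" }
```

With these inputs the sketch reports 128 threads in total, 64 per core, and 14 threads per worker that could be left waiting for a connection. Lowering the maximum thread count so that it does not exceed `pool` removes that shortfall; the right value still has to come from measurement.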

