Recovering RabbitMQ queues after a full disk

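Problem description

One of our durable RabbitMQ queues grew too large and filled the entire disk (>2 TB). RabbitMQ crashed. After we freed some space, RabbitMQ restarted, but the vhost we use, /, did not. The log below seems to indicate a problem with a process that cannot be found (:noproc in the log).

This is a production server, and some of the queues contain important data. How can I recover at least some of the queues?

CLI log: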

$ rabbitmqctl restart_vhost --vhost /
Trying to restart vhost '/' on node 'rabbit@inoopa-storage' ...
Error:
Failed to start vhost '/' on node 'rabbit@inoopa-storage'Reason: {:shutdown, {:failed_to_start_child, :rabbit_vhost_process, {:error, {{:noproc, {:gen_server2, :call, [#PID<10613.14642.53>, :out, :infinity]}}, {:child, :undefined, :msg_store_persistent, {:rabbit_msg_store, :start_link, [:msg_store_persistent, '/home/rabbitmq/mnesia/rabbit@inoopa-storage/msg_stores/vhosts/628WB79CIFDYO9LJI6DKMI09L', [], {#Function<2.23124100/1 in :rabbit_queue_index>, {:start, [{:resource, "/", :queue, "workers_output_vector"}, {:resource, "/", :queue, "workers_scraped_keywords"}, {:resource, "/", :queue, "workers_input_oxylabs"}, {:resource, "/", :queue, "dev_ecommerce_detector_scrapped"}, {:resource, "/", :queue, "ecommerce_detector_input"}, {:resource, "/", :queue, "manual_input_facebook_not_priority"}, {:resource, "/", :queue, "manual_input_facebook"}, {:resource, "/", :queue, "manual_input_websitefinder"}, {:resource, "/", :queue, "dev_ecommerce_detector_output"}, {:resource, "/", :queue, "treatment_monitor"}, {:resource, "/", :queue, "workers_filtered_keywords"}, {:resource, "/", :queue, "dev2"}, {:resource, "/", :queue, "elastic_worker_tmp"}, {:resource, "/", :queue, "workers_input_scrapper"}, {:resource, "/", :queue, "workers_input_oxylabs_problematic"}, {:resource, "/", :queue, "best_phones_updater_input"}, {:resource, "/", :queue, "best_sector_updater_input"}, {:resource, "/", :queue, "workers_input_oxylabs_not_priority"}, {:resource, "/", :queue, "dev3"}, {:resource, "/", :queue, "workers_input_keywords"}, {:resource, "/", :queue, "MoveQueue.py"}, {:resource, "/", :queue, "manual_output_websitefinder"}, {:resource, "/", :queue, ...}, {:resource, "/", ...}, {:resource, ...}, {...}, ...]}}]}, :transient, 30000, :worker, [:rabbit_msg_store]}}}}}

The message store is apparently down:

$ rabbitmqctl purge_queue dev
Purging queue 'dev' in vhost '/' ...
Error:
not_found

# From logs

2021-06-22 16:56:09.259 [info] <0.2025.0> Message store for directory '/home/rabbitmq/mnesia/rabbit@inoopa-storage/msg_stores/vhosts/628WB79CIFDYO9LJI6DKMI09L/msg_store_transient' is stopped

Full server log when restarting the vhost:

2021-06-22 16:48:12.293 [info] <0.1471.0> Making sure data directory '/home/rabbitmq/mnesia/rabbit@inoopa-storage/msg_stores/vhosts/628WB79CIFDYO9LJI6DKMI09L' for vhost '/' exists
2021-06-22 16:48:12.298 [info] <0.1471.0> Starting message stores for vhost '/'
2021-06-22 16:48:12.298 [info] <0.1475.0> Message store "628WB79CIFDYO9LJI6DKMI09L/msg_store_transient": using rabbit_msg_store_ets_index to provide index
2021-06-22 16:48:12.300 [info] <0.1471.0> Started message store of type transient for vhost '/'
2021-06-22 16:48:12.300 [info] <0.1478.0> Message store "628WB79CIFDYO9LJI6DKMI09L/msg_store_persistent": using rabbit_msg_store_ets_index to provide index
2021-06-22 16:48:15.320 [warning] <0.1478.0> Message store "628WB79CIFDYO9LJI6DKMI09L/msg_store_persistent": rebuilding indices from scratch
2021-06-22 16:48:15.700 [error] <0.885.0> ** Generic server <0.885.0> terminating
** Last message in was {'$gen_cast',{submit_async,#Fun<rabbit_queue_index.40.23124100>}}
** When Server state == undefined
** Reason for termination == 
** {function_clause,[{rabbit_queue_index,parse_segment_entries,[<<"G">>,false,{{array,16384,0,undefined,{{{{{{{true,<<131,221,139,210,241,243,169,169,198,3,109,103,34,100,0,173,0,0,0,0,0,0,0,0,0,0,0,40>>,<<131,... 96,240,167,250,99,100,0,4,116,114,117,101>>},no_del,no_ack},{{true,<<255,95,42,151,38,17,155,140,20,36,169,143,228,131,83,236,0,0,0,0,0,0,0,0,0,0,0,30>>,<<131,104,6,100,0,13,98,97,115,105,99,95,109,101,115,115,97,103,101,104,4,100,0,8,114,101,...>>},...},...},...},...},...},...}},...}],...},...]}
2021-06-22 16:48:15.706 [error] <0.885.0> CRASH REPORT Process <0.885.0> with 8 neighbours exited with reason: no function clause matching rabbit_queue_index:parse_segment_entries(<<"G">>, false, {{array,16384,0,undefined,{{{{{{{true,<<131,221,139,210,241,243,169,169,198,3,109,103,34,...>>,...},...},...},...},...},...},...}},...}) line 1133 in gen_server2:terminate/3 line 1183
2021-06-22 16:48:15.709 [error] <0.350.0> Supervisor worker_pool_sup had child 5 started with worker_pool_worker:start_link(worker_pool) at <0.885.0> exit with reason no function clause matching rabbit_queue_index:parse_segment_entries(<<"G">>, false, {{array,16384,0,undefined,{{{{{{{true,<<131,221,139,210,241,243,169,169,198,3,109,103,34,...>>,...},...},...},...},...},...},...}},...}) line 1133 in context child_terminated
2021-06-22 16:48:15.713 [error] <0.350.0> Supervisor worker_pool_sup had child 1 started with worker_pool_worker:start_link(worker_pool) at <0.882.0> exit with reason no function clause matching rabbit_queue_index:parse_segment_entries(<<"G">>, false, {{array,16384,0,undefined,{{{{{{{true,<<131,221,139,210,241,243,169,169,198,3,109,103,34,...>>,...},...},...},...},...},...},...}},...}) line 1133 in context child_terminated
2021-06-22 16:48:15.716 [error] <0.350.0> Supervisor worker_pool_sup had child 4 started with worker_pool_worker:start_link(worker_pool) at <0.881.0> exit with reason no function clause matching rabbit_queue_index:parse_segment_entries(<<"G">>, false, {{array,16384,0,undefined,{{{{{{{true,<<131,221,139,210,241,243,169,169,198,3,109,103,34,...>>,...},...},...},...},...},...},...}},...}) line 1133 in context child_terminated
2021-06-22 16:48:15.718 [error] <0.350.0> Supervisor worker_pool_sup had child 6 started with worker_pool_worker:start_link(worker_pool) at <0.884.0> exit with reason no function clause matching rabbit_queue_index:parse_segment_entries(<<"G">>, false, {{array,16384,0,undefined,{{{{{{{true,<<131,221,139,210,241,243,169,169,198,3,109,103,34,...>>,...},...},...},...},...},...},...}},...}) line 1133 in context child_terminated
2021-06-22 16:48:15.719 [error] <0.1471.0> Failed to start message store of type msg_store_persistent for vhost '/': {{{function_clause,[{rabbit_queue_index,parse_segment_entries,[<<"G">>,false,{{array,16384,0,undefined,{{{{{{{true,<<131...40,167,250,99,100,0,4,116,114,117,101>>},no_del,no_ack},{{true,<<255,95,42,151,38,17,155,140,20,36,169,143,228,131,83,236,0,0,0,0,0,0,0,0,0,0,0,30>>,<<131,104,6,100,0,13,98,97,115,105,99,95,109,101,115,115,97,103,101,104,4,100,0,8,114,101,115,111,117,114,99,101,109,0,0,0,1,47,100,0,8,101,120,99,104,97,110,103,101,109,0,0,0,0,108,0,0,0,1,109,0,0,0,24,101,99,111,109,109,101,114,99,101,95,100,101,116,101,99,116,111,...>>},...},...},...},...},...},...}},...}],...},...]},...},...}
2021-06-22 16:48:15.721 [error] <0.350.0> Supervisor worker_pool_sup had child 3 started with worker_pool_worker:start_link(worker_pool) at <0.883.0> exit with reason no function clause matching rabbit_queue_index:parse_segment_entries(<<"G">>, false, {{array,16384,0,undefined,{{{{{{{true,<<131,221,139,210,241,243,169,169,198,3,109,103,34,...>>,...},...},...},...},...},...},...}},...}) line 1133 in context child_terminated
2021-06-22 16:48:15.751 [error] <0.1478.0> CRASH REPORT Process <0.1478.0> with 1 neighbours exited with reason: {{function_clause,[{rabbit_queue_index,parse_segment_entries,[<<"G">>,false,{{array,16384,0,undefined,{{{{{{{true,<<131,221,139,210,241,243,169,169,198,3,109,103,34,100,0,173,0,0,0,0,0,0,0,0,0,0,0,40>>,<<131,104,6,100,0,13,98,97,115,105,99,95,109,101,115,115,97,103,101,104,4,100,0,8,114,101,115,111,117,114,99,101,109,0,0,0,1,47,100,0,8,101,120,99,104,97,110,103,101,109,0,0,0,0,108,0,0,0,1,109,0,0,0,24,101,99,111,109,109,101,114,99,101,95,100,101,116,101,99,116,111,114,95,105,110,112,...>>},...},...},...},...},...},...}},...}],...},...]},...} in gen_server2:init_it/6 line 608
2021-06-22 16:48:15.752 [error] <0.1491.0> ** Generic server <0.1491.0> terminating
** Last message in was {'$gen_cast',{submit_async,#Fun<rabbit_queue_index.40.23124100>}}
** When Server state == undefined
** Reason for termination == 
** {noproc,[{erlang,link,[<0.1487.0>],[]},{rabbit_queue_index,'-queue_index_walker/1-fun-1-',2,[{file,"src/rabbit_queue_index.erl"},{line,709}]},{worker_pool_worker,handle_cast,2,[{file,"src/worker_pool_worker.erl"},{line,122}]},{gen_server2,handle_msg,2,[{file,"src/gen_server2.erl"},{line,1067}]},{proc_lib,init_p_do_apply,3,[{file,"proc_lib.erl"},{line,249}]}]}
2021-06-22 16:48:15.752 [error] <0.1492.0> ** Generic server <0.1492.0> terminating
** Last message in was {'$gen_cast',{submit_async,#Fun<rabbit_queue_index.40.23124100>}}
** When Server state == undefined
** Reason for termination == 
** {noproc,[{erlang,link,[<0.1487.0>],[]},{rabbit_queue_index,'-queue_index_walker/1-fun-1-',2,[{file,"src/rabbit_queue_index.erl"},{line,709}]},{worker_pool_worker,handle_cast,2,[{file,"src/worker_pool_worker.erl"},{line,122}]},{gen_server2,handle_msg,2,[{file,"src/gen_server2.erl"},{line,1067}]},{proc_lib,init_p_do_apply,3,[{file,"proc_lib.erl"},{line,249}]}]}
2021-06-22 16:48:15.753 [error] <0.1489.0> ** Generic server <0.1489.0> terminating
** Last message in was {'$gen_cast',{submit_async,#Fun<rabbit_queue_index.40.23124100>}}
** When Server state == undefined
** Reason for termination == 
** {noproc,[{erlang,link,[<0.1487.0>],[]},{rabbit_queue_index,'-queue_index_walker/1-fun-1-',2,[{file,"src/rabbit_queue_index.erl"},{line,709}]},{worker_pool_worker,handle_cast,2,[{file,"src/worker_pool_worker.erl"},{line,122}]},{gen_server2,handle_msg,2,[{file,"src/gen_server2.erl"},{line,1067}]},{proc_lib,init_p_do_apply,3,[{file,"proc_lib.erl"},{line,249}]}]}
2021-06-22 16:48:15.753 [error] <0.1493.0> ** Generic server <0.1493.0> terminating
** Last message in was {'$gen_cast',{submit_async,#Fun<rabbit_queue_index.40.23124100>}}
** When Server state == undefined
** Reason for termination == 
** {noproc,[{erlang,link,[<0.1487.0>],[]},{rabbit_queue_index,'-queue_index_walker/1-fun-1-',2,[{file,"src/rabbit_queue_index.erl"},{line,709}]},{worker_pool_worker,handle_cast,2,[{file,"src/worker_pool_worker.erl"},{line,122}]},{gen_server2,handle_msg,2,[{file,"src/gen_server2.erl"},{line,1067}]},{proc_lib,init_p_do_apply,3,[{file,"proc_lib.erl"},{line,249}]}]}
2021-06-22 16:48:15.753 [error] <0.1490.0> ** Generic server <0.1490.0> terminating
** Last message in was {'$gen_cast',{submit_async,#Fun<rabbit_queue_index.40.23124100>}}
** When Server state == undefined
** Reason for termination == 
** {noproc,[{erlang,link,[<0.1487.0>],[]},{rabbit_queue_index,'-queue_index_walker/1-fun-1-',2,[{file,"src/rabbit_queue_index.erl"},{line,709}]},{worker_pool_worker,handle_cast,2,[{file,"src/worker_pool_worker.erl"},{line,122}]},{gen_server2,handle_msg,2,[{file,"src/gen_server2.erl"},{line,1067}]},{proc_lib,init_p_do_apply,3,[{file,"proc_lib.erl"},{line,249}]}]}
2021-06-22 16:48:15.772 [error] <0.350.0> Supervisor worker_pool_sup had child 7 started with worker_pool_worker:start_link(worker_pool) at <0.887.0> exit with reason no function clause matching rabbit_queue_index:parse_segment_entries(<<"G">>, false, {{array,16384,0,undefined,{{{{{{{true,<<131,221,139,210,241,243,169,169,198,3,109,103,34,...>>,...},...},...},...},...},...},...}},...}) line 1133 in context child_terminated
2021-06-22 16:48:15.772 [error] <0.1491.0> CRASH REPORT Process <0.1491.0> with 0 neighbours exited with reason: no such process or port in call to erlang:link(<0.1487.0>) in gen_server2:terminate/3 line 1183
2021-06-22 16:48:15.773 [error] <0.1492.0> CRASH REPORT Process <0.1492.0> with 0 neighbours exited with reason: no such process or port in call to erlang:link(<0.1487.0>) in gen_server2:terminate/3 line 1183
2021-06-22 16:48:15.773 [error] <0.1489.0> CRASH REPORT Process <0.1489.0> with 0 neighbours exited with reason: no such process or port in call to erlang:link(<0.1487.0>) in gen_server2:terminate/3 line 1183
2021-06-22 16:48:15.773 [error] <0.1493.0> CRASH REPORT Process <0.1493.0> with 0 neighbours exited with reason: no such process or port in call to erlang:link(<0.1487.0>) in gen_server2:terminate/3 line 1183
2021-06-22 16:48:15.773 [error] <0.1490.0> CRASH REPORT Process <0.1490.0> with 0 neighbours exited with reason: no such process or port in call to erlang:link(<0.1487.0>) in gen_server2:terminate/3 line 1183
2021-06-22 16:48:15.773 [error] <0.1494.0> ** Generic server <0.1494.0> terminating
** Last message in was {'$gen_cast',{submit_async,#Fun<rabbit_queue_index.40.23124100>}}
** When Server state == undefined
** Reason for termination == 
** {noproc,[{erlang,link,[<0.1487.0>],[]},{rabbit_queue_index,'-queue_index_walker/1-fun-1-',2,[{file,"src/rabbit_queue_index.erl"},{line,709}]},{worker_pool_worker,handle_cast,2,[{file,"src/worker_pool_worker.erl"},{line,122}]},{gen_server2,handle_msg,2,[{file,"src/gen_server2.erl"},{line,1067}]},{proc_lib,init_p_do_apply,3,[{file,"proc_lib.erl"},{line,249}]}]}
2021-06-22 16:48:15.773 [error] <0.1494.0> CRASH REPORT Process <0.1494.0> with 0 neighbours exited with reason: no such process or port in call to erlang:link(<0.1487.0>) in gen_server2:terminate/3 line 1183
2021-06-22 16:48:15.799 [error] <0.350.0> Supervisor worker_pool_sup had child 8 started with worker_pool_worker:start_link(worker_pool) at <0.886.0> exit with reason no function clause matching rabbit_queue_index:parse_segment_entries(<<"G">>, false, {{array,16384,0,undefined,{{{{{{{true,<<131,221,139,210,241,243,169,169,198,3,109,103,34,...>>,...},...},...},...},...},...},...}},...}) line 1133 in context child_terminated
2021-06-22 16:48:15.799 [error] <0.1495.0> ** Generic server <0.1495.0> terminating
** Last message in was {'$gen_cast',{submit_async,#Fun<rabbit_queue_index.40.23124100>}}
** When Server state == undefined
** Reason for termination == 
** {noproc,[{erlang,link,[<0.1487.0>],[]},{rabbit_queue_index,'-queue_index_walker/1-fun-1-',2,[{file,"src/rabbit_queue_index.erl"},{line,709}]},{worker_pool_worker,handle_cast,2,[{file,"src/worker_pool_worker.erl"},{line,122}]},{gen_server2,handle_msg,2,[{file,"src/gen_server2.erl"},{line,1067}]},{proc_lib,init_p_do_apply,3,[{file,"proc_lib.erl"},{line,249}]}]}
2021-06-22 16:48:15.799 [error] <0.1495.0> CRASH REPORT Process <0.1495.0> with 0 neighbours exited with reason: no such process or port in call to erlang:link(<0.1487.0>) in gen_server2:terminate/3 line 1183
2021-06-22 16:48:15.800 [error] <0.350.0> Supervisor worker_pool_sup had child 2 started with worker_pool_worker:start_link(worker_pool) at <0.888.0> exit with reason no function clause matching rabbit_queue_index:parse_segment_entries(<<"G">>, false, {{array,16384,0,undefined,{{{{{{{true,<<131,221,139,210,241,243,169,169,198,3,109,103,34,...>>,...},...},...},...},...},...},...}},...}) line 1133 in context child_terminated
2021-06-22 16:48:15.800 [error] <0.350.0> Supervisor worker_pool_sup had child 4 started with worker_pool_worker:start_link(worker_pool) at <0.1491.0> exit with reason no such process or port in call to erlang:link(<0.1487.0>) in context child_terminated
2021-06-22 16:48:15.800 [error] <0.1496.0> ** Generic server <0.1496.0> terminating
** Last message in was {'$gen_cast',{submit_async,#Fun<rabbit_queue_index.40.23124100>}}
** When Server state == undefined
** Reason for termination == 
** {noproc,[{erlang,link,[<0.1487.0>],[]},{rabbit_queue_index,'-queue_index_walker/1-fun-1-',2,[{file,"src/rabbit_queue_index.erl"},{line,709}]},{worker_pool_worker,handle_cast,2,[{file,"src/worker_pool_worker.erl"},{line,122}]},{gen_server2,handle_msg,2,[{file,"src/gen_server2.erl"},{line,1067}]},{proc_lib,init_p_do_apply,3,[{file,"proc_lib.erl"},{line,249}]}]}
2021-06-22 16:48:15.800 [error] <0.1496.0> CRASH REPORT Process <0.1496.0> with 0 neighbours exited with reason: no such process or port in call to erlang:link(<0.1487.0>) in gen_server2:terminate/3 line 1183
2021-06-22 16:48:15.800 [error] <0.350.0> Supervisor worker_pool_sup had child 6 started with worker_pool_worker:start_link(worker_pool) at <0.1492.0> exit with reason no such process or port in call to erlang:link(<0.1487.0>) in context child_terminated
2021-06-22 16:48:15.800 [error] <0.1497.0> ** Generic server <0.1497.0> terminating
** Last message in was {'$gen_cast',{submit_async,#Fun<rabbit_queue_index.40.23124100>}}
** When Server state == undefined
** Reason for termination == 
** {noproc,[{erlang,link,[<0.1487.0>],[]},{rabbit_queue_index,'-queue_index_walker/1-fun-1-',2,[{file,"src/rabbit_queue_index.erl"},{line,709}]},{worker_pool_worker,handle_cast,2,[{file,"src/worker_pool_worker.erl"},{line,122}]},{gen_server2,handle_msg,2,[{file,"src/gen_server2.erl"},{line,1067}]},{proc_lib,init_p_do_apply,3,[{file,"proc_lib.erl"},{line,249}]}]}
2021-06-22 16:48:15.801 [error] <0.1497.0> CRASH REPORT Process <0.1497.0> with 0 neighbours exited with reason: no such process or port in call to erlang:link(<0.1487.0>) in gen_server2:terminate/3 line 1183
2021-06-22 16:48:15.801 [error] <0.350.0> Supervisor worker_pool_sup had child 5 started with worker_pool_worker:start_link(worker_pool) at <0.1489.0> exit with reason no such process or port in call to erlang:link(<0.1487.0>) in context child_terminated
2021-06-22 16:48:15.801 [error] <0.1498.0> ** Generic server <0.1498.0> terminating
** Last message in was {'$gen_cast',{submit_async,#Fun<rabbit_queue_index.40.23124100>}}
** When Server state == undefined
** Reason for termination == 
** {noproc,[{erlang,link,[<0.1487.0>],[]},{rabbit_queue_index,'-queue_index_walker/1-fun-1-',2,[{file,"src/rabbit_queue_index.erl"},{line,709}]},{worker_pool_worker,handle_cast,2,[{file,"src/worker_pool_worker.erl"},{line,122}]},{gen_server2,handle_msg,2,[{file,"src/gen_server2.erl"},{line,1067}]},{proc_lib,init_p_do_apply,3,[{file,"proc_lib.erl"},{line,249}]}]}
2021-06-22 16:48:15.801 [error] <0.1498.0> CRASH REPORT Process <0.1498.0> with 0 neighbours exited with reason: no such process or port in call to erlang:link(<0.1487.0>) in gen_server2:terminate/3 line 1183
...
2021-06-22 16:48:15.811 [error] <0.1469.0> Supervisor {<0.1469.0>,rabbit_vhost_sup_wrapper} had child rabbit_vhost_process started with rabbit_vhost_process:start_link(<<"/">>) at undefined exit with reason {error,{{{function_clause,[{rabbit_queue_index,parse_segment_entries,[<<"G">>,false,{{array,16384,0,undefined,{{{{{{{true,<<131,221,139,210,241,243,169,169,198,3,109,103,34,100,0,173,0,0,0,0,0,0,0,0,0,0,0,40>>,<<131,104,6,100,0,13,98,97,115,105,99,95,109,101,115,115,97,103,101,104,4,100,0,8,114,101,115,111,117,114,99,101,109,0,0,0,1,47,100,0,8,101,120,99,104,97,110,103,101,109,0,0,0,0,108,0,0,0,1,109,0,0,0,24,101,99,111,109,109,101,114,99,101,95,100,101,116,101,99,116,111,114,95,105,...>>},...},...},...},...},...},...}},...}],...},...]},...},...}} in context start_error
2021-06-22 16:48:15.906 [info] <0.1475.0> Message store for directory '/home/rabbitmq/mnesia/rabbit@inoopa-storage/msg_stores/vhosts/628WB79CIFDYO9LJI6DKMI09L/msg_store_transient' is stopped
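
The question reads the failure as a :noproc problem. One way to check that reading is to scan the server log for the first occurrence of each failure signature and compare timestamps. The following is a minimal sketch, not a RabbitMQ tool: ENTRY, SIGNATURES, first_seen and SAMPLE are hypothetical names, the entry format is assumed from the lines shown above, and the sample lines are abbreviated excerpts of that same log.

```python
import re

# Assumed entry format, taken from the log above:
# "2021-06-22 16:48:15.700 [error] <0.885.0> message..."
ENTRY = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+) "
    r"\[(?P<level>\w+)\] (?P<pid><[\d.]+>) (?P<msg>.*)$"
)

# Hypothetical labels for the two failure signatures visible in the log.
SIGNATURES = [
    ("function_clause in parse_segment_entries",
     re.compile(r"parse_segment_entries")),
    ("noproc in erlang:link",
     re.compile(r"noproc|no such process or port")),
]


def first_seen(lines):
    """Map each signature label to the (timestamp, pid) of the first
    [error] entry whose text, including continuation lines, matches it."""
    seen = {}
    current = None  # (ts, pid) of the error entry being read, if any
    for line in lines:
        m = ENTRY.match(line)
        if m:
            # A new entry starts; only [error] entries are tracked.
            current = (m["ts"], m["pid"]) if m["level"] == "error" else None
            text = m["msg"]
        else:
            # Continuation line, e.g. "** Reason for termination == ..."
            text = line
        if current is None:
            continue
        for label, pattern in SIGNATURES:
            if label not in seen and pattern.search(text):
                seen[label] = current
    return seen


# Abbreviated excerpts of the server log above (long terms cut short).
SAMPLE = """\
2021-06-22 16:48:15.320 [warning] <0.1478.0> Message store: rebuilding indices from scratch
2021-06-22 16:48:15.700 [error] <0.885.0> ** Generic server <0.885.0> terminating
** Reason for termination ==
** {function_clause,[{rabbit_queue_index,parse_segment_entries,[<<"G">>,false,
2021-06-22 16:48:15.752 [error] <0.1491.0> ** Generic server <0.1491.0> terminating
** Reason for termination ==
** {noproc,[{erlang,link,[<0.1487.0>],[]},
""".splitlines()

result = first_seen(SAMPLE)
# Print the signatures in the order in which they first appeared.
for label, (ts, pid) in sorted(result.items(), key=lambda kv: kv[1][0]):
    print(ts, pid, label)
```

In the log above, the function_clause error in rabbit_queue_index:parse_segment_entries (16:48:15.700) comes before the first noproc error (16:48:15.752). That ordering suggests the noproc errors may be a consequence of the index parsing failure rather than its cause, although the log alone does not prove it.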

Tags: rabbitmq, message-queue, rabbitmqctl

Solution

