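java - Netty does not process all incoming requests if one handler gets stuck in an infinite loop
Question
We hit a bug in our application where, while processing our protocol, one handler entered an infinite loop and got stuck inside its channelRead() method.
However, this caused other new connections (some, not all) to get stuck somewhere during connection as well. No thread is visibly stuck on those connections. The number of established connections gradually grows, and eventually new connections fail to connect and start timing out.
Why would one thread stuck in an infinite loop inside channelRead block any other connections, when roughly 32 threads are available for processing? I confirmed that as soon as the stuck thread continues, all of the stuck connections recover.
I reproduced the behavior with this simple example:
AppServer.java: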
import io.netty.bootstrap.ServerBootstrap;
import io.netty.channel.ChannelFuture;
import io.netty.channel.ChannelOption;
import io.netty.channel.EventLoopGroup;
import io.netty.channel.epoll.EpollEventLoopGroup;
import io.netty.channel.epoll.EpollServerSocketChannel;

public class AppServer {
    private static final int HTTP_PORT = 8080;

    public void run() throws Exception {
        EventLoopGroup bossGroup = new EpollEventLoopGroup();
        EventLoopGroup workerGroup = new EpollEventLoopGroup();
        try {
            ServerBootstrap httpBootstrap = new ServerBootstrap();
            httpBootstrap
                .group(bossGroup, workerGroup)
                .channel(EpollServerSocketChannel.class)
                .childHandler(new ServerInitializer())
                .option(ChannelOption.SO_BACKLOG, 512)
                .childOption(ChannelOption.SO_KEEPALIVE, true);
            // Bind and start to accept incoming connections.
            ChannelFuture httpChannel = httpBootstrap.bind(HTTP_PORT).sync();
            // Wait until the server socket is closed
            httpChannel.channel().closeFuture().sync();
        }
        finally {
            workerGroup.shutdownGracefully();
            bossGroup.shutdownGracefully();
        }
    }

    public static void main(String[] args) throws Exception {
        new AppServer().run();
    }
}
ServerHandler.java:
import io.netty.buffer.ByteBuf;
import io.netty.buffer.Unpooled;
import io.netty.channel.ChannelHandlerContext;
import io.netty.channel.SimpleChannelInboundHandler;
import io.netty.handler.codec.http.*;
import io.netty.util.CharsetUtil;

public class ServerHandler extends SimpleChannelInboundHandler<FullHttpRequest> {
    public static int count = 0;

    @Override
    protected void channelRead0(ChannelHandlerContext ctx, FullHttpRequest msg) {
        // The first request deliberately never returns, to reproduce the stuck handler.
        if (count == 0) {
            count++;
            while (true) {}
        }
        ByteBuf content = Unpooled.copiedBuffer("Hello World!", CharsetUtil.UTF_8);
        FullHttpResponse response = new DefaultFullHttpResponse(HttpVersion.HTTP_1_1, HttpResponseStatus.OK, content);
        response.headers().set(HttpHeaderNames.CONTENT_TYPE, "text/html");
        response.headers().set(HttpHeaderNames.CONTENT_LENGTH, content.readableBytes());
        ctx.write(response);
        ctx.flush();
        count++;
    }
}
ServerInitializer.java:
import io.netty.channel.Channel;
import io.netty.channel.ChannelInitializer;
import io.netty.channel.ChannelPipeline;
import io.netty.handler.codec.http.HttpObjectAggregator;
import io.netty.handler.codec.http.HttpServerCodec;

public class ServerInitializer extends ChannelInitializer<Channel> {
    @Override
    protected void initChannel(Channel ch) {
        ChannelPipeline pipeline = ch.pipeline();
        pipeline.addLast(new HttpServerCodec());
        pipeline.addLast(new HttpObjectAggregator(Integer.MAX_VALUE));
        pipeline.addLast(new ServerHandler());
    }
}
I simply open many connections using xargs and curl:
curl -v http://[ip]:8080
After a while, it gets into a state where connection attempts fail with a timeout:
nc -vz [ip] 8080
Ncat: Version 7.50 ( https://nmap.org/ncat )
Ncat: Connection timed out.
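This does not show up on the loopback interface.
If no thread is stuck, Netty does not run into this problem under the same test. It handles all requests and no connections get stuck.
I also tried Nio. Same result.
Netty 4.1
Established connections that are stuck: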
netstat -n | grep 8080 | sed -E 's/[[:space:]]+/ /g' | cut -d' ' -f 6 | sort | uniq -c
14551 ESTABLISHED
6839 TIME_WAIT
A stuck connection looks like this:
curl -v http://localhost:8080
* Rebuilt URL to: http://localhost:8080/
* Trying 127.0.0.1...
* TCP_NODELAY set
* Connected to localhost (127.0.0.1) port 8080 (#0)
> GET / HTTP/1.1
> Host: localhost:8080
> User-Agent: curl/7.61.1
> Accept: */*
>
tcpdump when the connection works normally:
00:25:35.135929 IP 10.94.158.96.50192 > 10.200.154.102.8080: Flags [S], seq 648221383, win 65340, options [mss 1210,nop,wscale 8,nop,nop,sackOK], length 0
00:25:35.135993 IP 10.200.154.102.8080 > 10.94.158.96.50192: Flags [S.], seq 167219764, ack 648221384, win 35844, options [mss 8961,nop,nop,sackOK,nop,wscale 8], length 0
00:25:35.174362 IP 10.94.158.96.50192 > 10.200.154.102.8080: Flags [.], ack 1, win 515, length 0
00:25:35.176419 IP 10.94.158.96.50192 > 10.200.154.102.8080: Flags [P.], seq 1:84, ack 1, win 515, length 83
00:25:35.176440 IP 10.200.154.102.8080 > 10.94.158.96.50192: Flags [.], ack 84, win 140, length 0
00:25:35.177307 IP 10.200.154.102.8080 > 10.94.158.96.50192: Flags [P.], seq 1:77, ack 84, win 140, length 76
00:25:35.216609 IP 10.94.158.96.50192 > 10.200.154.102.8080: Flags [F.], seq 84, ack 77, win 514, length 0
00:25:35.216856 IP 10.200.154.102.8080 > 10.94.158.96.50192: Flags [F.], seq 77, ack 85, win 140, length 0
00:25:35.254134 IP 10.94.158.96.50192 > 10.200.154.102.8080: Flags [.], ack 78, win 514, length 0
tcpdump when the connection is stuck:
00:25:38.409177 IP 10.94.158.96.50193 > 10.200.154.102.8080: Flags [S], seq 8750522, win 65340, options [mss 1210,nop,wscale 8,nop,nop,sackOK], length 0
00:25:38.409254 IP 10.200.154.102.8080 > 10.94.158.96.50193: Flags [S.], seq 1214051234, ack 8750523, win 35844, options [mss 8961,nop,nop,sackOK,nop,wscale 8], length 0
00:25:38.446641 IP 10.94.158.96.50193 > 10.200.154.102.8080: Flags [.], ack 1, win 515, length 0
00:25:38.449108 IP 10.94.158.96.50193 > 10.200.154.102.8080: Flags [P.], seq 1:84, ack 1, win 515, length 83
00:25:38.449141 IP 10.200.154.102.8080 > 10.94.158.96.50193: Flags [.], ack 84, win 140, length 0
00:25:39.535154 IP 10.94.158.96.50193 > 10.200.154.102.8080: Flags [.], seq 83:84, ack 1, win 515, length 1
00:25:39.535211 IP 10.200.154.102.8080 > 10.94.158.96.50193: Flags [.], ack 84, win 140, options [nop,nop,sack 1 {83:84}], length 0
00:25:40.641378 IP 10.94.158.96.50193 > 10.200.154.102.8080: Flags [.], seq 83:84, ack 1, win 515, length 1
00:25:40.641404 IP 10.200.154.102.8080 > 10.94.158.96.50193: Flags [.], ack 84, win 140, options [nop,nop,sack 1 {83:84}], length 0
00:25:41.741142 IP 10.94.158.96.50193 > 10.200.154.102.8080: Flags [.], seq 83:84, ack 1, win 515, length 1
00:25:41.741199 IP 10.200.154.102.8080 > 10.94.158.96.50193: Flags [.], ack 84, win 140, options [nop,nop,sack 1 {83:84}], length 0
00:25:42.844891 IP 10.94.158.96.50193 > 10.200.154.102.8080: Flags [.], seq 83:84, ack 1, win 515, length 1
00:25:42.844947 IP 10.200.154.102.8080 > 10.94.158.96.50193: Flags [.], ack 84, win 140, options [nop,nop,sack 1 {83:84}], length 0
00:25:44.035849 IP 10.94.158.96.50193 > 10.200.154.102.8080: Flags [.], seq 83:84, ack 1, win 515, length 1
00:25:44.035869 IP 10.200.154.102.8080 > 10.94.158.96.50193: Flags [.], ack 84, win 140, options [nop,nop,sack 1 {83:84}], length 0
00:25:45.135646 IP 10.94.158.96.50193 > 10.200.154.102.8080: Flags [.], seq 83:84, ack 1, win 515, length 1
00:25:45.135702 IP 10.200.154.102.8080 > 10.94.158.96.50193: Flags [.], ack 84, win 140, options [nop,nop,sack 1 {83:84}], length 0
... repeats ...
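Answer
This makes sense... Netty uses the concept of an EventLoop, which means that it processes tasks and IO in a loop. The important point here is that Netty uses non-blocking IO, which means it never blocks on IO and can therefore handle multiple connections with a single thread. This is what allows it to handle 1M+ connections with only a small number of threads. That said, it also means that if you "block" a thread (which is basically what an infinite loop does), you also affect every other connection handled by that thread. Each connection is served by one EventLoop thread, so only the connections assigned to the stuck thread hang, while those on the other threads keep working.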
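To make the effect concrete, here is a minimal sketch that uses only the JDK, not Netty. Each single-threaded executor stands in for one event loop, and the class name EventLoopStarvationDemo, the simulate method, and the c % loops round-robin assignment are illustrative assumptions rather than Netty's actual code. One loop is kept busy by a task that does not return, and the simulation reports which "connections" still get served.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class EventLoopStarvationDemo {

    /** Returns, per simulated connection, whether it was served within the timeout. */
    public static boolean[] simulate(int loops, int connections, long timeoutMillis) throws Exception {
        // Each single-threaded executor stands in for one event loop (an assumption for this sketch).
        List<ExecutorService> group = new ArrayList<>();
        for (int i = 0; i < loops; i++) {
            group.add(Executors.newSingleThreadExecutor());
        }

        // Keep loop 0 busy until released; this plays the role of the handler's infinite loop.
        CountDownLatch release = new CountDownLatch(1);
        group.get(0).submit(() -> {
            try {
                release.await();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        // Assign connections to loops round-robin; each connection's work is a trivial task.
        List<Future<?>> pending = new ArrayList<>();
        for (int c = 0; c < connections; c++) {
            pending.add(group.get(c % loops).submit(() -> { }));
        }

        // A connection counts as served only if its task finishes within the timeout.
        boolean[] served = new boolean[connections];
        for (int c = 0; c < connections; c++) {
            try {
                pending.get(c).get(timeoutMillis, TimeUnit.MILLISECONDS);
                served[c] = true;
            } catch (TimeoutException e) {
                served[c] = false;
            }
        }

        // Unblock loop 0 so the queued tasks drain and the threads can exit.
        release.countDown();
        for (ExecutorService loop : group) {
            loop.shutdown();
        }
        return served;
    }

    public static void main(String[] args) throws Exception {
        int loops = 4;
        boolean[] served = simulate(loops, 12, 200);
        for (int c = 0; c < served.length; c++) {
            System.out.println("connection " + c + " on loop " + (c % loops)
                    + (served[c] ? " served" : " stuck"));
        }
    }
}
```

In this sketch only the connections that land on the blocked loop time out, which matches the "some, not all" pattern described in the question. In a real Netty application, one commonly used way to keep long-running work off the event loop is to register the handler with a separate executor group through ChannelPipeline.addLast(EventExecutorGroup, ChannelHandler...). Whether that fits depends on the application, and it does not make an infinite loop in a handler any less of a bug.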