Netty's file descriptor usage on OS X

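Question

In the PLC4X project we use Netty for clients that connect to PLCs, which act as servers. Sometimes, because of a user error or a PLC fault, a connection is refused instead of accepted. If we retry the connection repeatedly and as fast as possible, we run into the error message "Too many open files". I tried to clean up everything in my code, so I would assume that no file descriptors can leak: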

try {
  final NioEventLoopGroup workerGroup = new NioEventLoopGroup();

  Bootstrap bootstrap = new Bootstrap();
  bootstrap.group(workerGroup);
  bootstrap.channel(NioSocketChannel.class);
  bootstrap.option(ChannelOption.SO_KEEPALIVE, true);
  bootstrap.option(ChannelOption.TCP_NODELAY, true);
  // TODO we should use an explicit (configurable?) timeout here
  // bootstrap.option(ChannelOption.CONNECT_TIMEOUT_MILLIS, 1000);
  bootstrap.handler(channelHandler);
  // Start the client.
  final ChannelFuture f = bootstrap.connect(address, port);
  f.addListener(new GenericFutureListener<Future<? super Void>>() {
      @Override public void operationComplete(Future<? super Void> future) throws Exception {
          if (!future.isSuccess()) {
              logger.info("Unable to connect, shutting down worker thread.");
              workerGroup.shutdownGracefully();
          }
      }
  });
  // Wait for sync
  f.sync();
  f.awaitUninterruptibly(); // jf: unsure if we need that
  // Wait till the session is finished initializing.
  return f.channel();
} catch (InterruptedException e) {
  Thread.currentThread().interrupt();
  throw new PlcConnectionException("Error creating channel.", e);
} catch (Exception e) {
  throw new PlcConnectionException("Error creating channel.", e);
}

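As I understand it, the listener should always shut down the group and release every descriptor it used. In practice, however, when I run this on macOS Catalina, about 1% of the failures are caused not by "connection refused" but by "Too many open files". Is this a ulimit issue, because Netty (on macOS) simply needs a certain number of fds? Or am I leaking something?

Thanks for any clarification!

Tags: java, macos, tcp, netty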

Solution


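I found the solution, more or less by myself. The original implementation had two problems (possibly even three), and none of them is really specific to Mac OS X:

  • The connect and addListener calls should be chained.
  • workerGroup.shutdownGracefully() is triggered on another thread, so the main (calling) thread has already finished by then.
  • Nothing waited for the workerGroup to actually finish shutting down.

Together, these can lead to a situation where new groups are spawned faster than the old ones are shut down. I therefore changed the implementation to: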

try {
    final NioEventLoopGroup workerGroup = new NioEventLoopGroup();

    Bootstrap bootstrap = new Bootstrap();
    bootstrap.group(workerGroup);
    bootstrap.channel(NioSocketChannel.class);
    bootstrap.option(ChannelOption.SO_KEEPALIVE, true);
    bootstrap.option(ChannelOption.TCP_NODELAY, true);
    // TODO we should use an explicit (configurable?) timeout here
    // bootstrap.option(ChannelOption.CONNECT_TIMEOUT_MILLIS, 1000);
    bootstrap.handler(channelHandler);
    // Start the client.
    logger.trace("Starting connection attempt on tcp layer to {}:{}", address.getHostAddress(), port);
    final ChannelFuture f = bootstrap.connect(address, port);
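    // Block until the connection attempt has completed.
    try {
        f.sync();
    } catch (Exception e) {
        if (e instanceof InterruptedException) {
            // Restore the interrupt flag, as the original version did.
            Thread.currentThread().interrupt();
        }
        // Shut down the worker group here and wait until it has terminated.
        logger.info("Unable to connect, shutting down worker thread.");
        workerGroup.shutdownGracefully().awaitUninterruptibly();
        logger.debug("Worker Group is shutdown successfully.");
        throw new PlcConnectionException("Unable to Connect on TCP Layer to " + address.getHostAddress() + ":" + port, e);
    }
    // Wait till the session is finished initializing.
    return f.channel();
} catch (PlcConnectionException e) {
    // Already describes the failure; rethrow instead of wrapping it a second time.
    throw e;
} catch (Exception e) {
    throw new PlcConnectionException("Error creating channel.", e);
}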

This fixes the problem described above: the call now returns only after cleanup has completed.
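The difference between requesting a shutdown and waiting for it can be reproduced without Netty. The following is a minimal sketch that uses a plain java.util.concurrent.ExecutorService as a stand-in for the NioEventLoopGroup; it is an analogy, not Netty's implementation. The class name ShutdownWaitSketch, the method terminatedOnReturn and the 200 ms of simulated cleanup work are all made up for illustration.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical stand-in: an ExecutorService plays the role of the worker group.
final class ShutdownWaitSketch {

    /** Requests shutdown and reports whether the pool had terminated when this method returned. */
    static boolean terminatedOnReturn(boolean waitForTermination) throws InterruptedException {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        // Simulated cleanup work that is still in flight when shutdown is requested.
        pool.execute(() -> {
            try {
                Thread.sleep(200);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        // shutdown() only requests termination and returns immediately.
        pool.shutdown();
        if (waitForTermination) {
            // Blocking here is the counterpart of awaiting the shutdown future.
            pool.awaitTermination(5, TimeUnit.SECONDS);
        }
        return pool.isTerminated();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("fire-and-forget: terminated = " + terminatedOnReturn(false));
        System.out.println("waited:          terminated = " + terminatedOnReturn(true));
    }
}
```

In the fire-and-forget case the caller returns while the worker is still alive, so a tight retry loop can start the next worker before the previous one has released its resources. Waiting for termination removes that overlap at the cost of a slower failure path.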

My tests now show a constant number of open file descriptors.
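One way to check such a claim from inside the JVM, sketched here under a few assumptions: on Unix-like systems, HotSpot/OpenJDK exposes the process's descriptor count through com.sun.management.UnixOperatingSystemMXBean. Netty's NIO event loops are built on java.nio selectors, so the sketch opens a few selectors directly to show that each one holds descriptors until it is closed. The class OpenFdProbe and its methods are hypothetical helpers, not part of Netty or PLC4X, and the exact number of descriptors per selector depends on the platform.

```java
import com.sun.management.UnixOperatingSystemMXBean;
import java.io.IOException;
import java.lang.management.ManagementFactory;
import java.lang.management.OperatingSystemMXBean;
import java.nio.channels.Selector;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for watching this JVM's open file descriptors.
final class OpenFdProbe {

    /** Open descriptors of this JVM process, or -1 where the JDK does not expose the count. */
    static long openFileDescriptors() {
        OperatingSystemMXBean os = ManagementFactory.getOperatingSystemMXBean();
        if (os instanceof UnixOperatingSystemMXBean) {
            return ((UnixOperatingSystemMXBean) os).getOpenFileDescriptorCount();
        }
        return -1;
    }

    /** Samples the count before opening n selectors, while they are open, and after closing them. */
    static long[] sampleAroundSelectors(int n) throws IOException {
        long before = openFileDescriptors();
        List<Selector> selectors = new ArrayList<>();
        long during;
        try {
            for (int i = 0; i < n; i++) {
                selectors.add(Selector.open());
            }
            during = openFileDescriptors();
        } finally {
            // Closing a selector releases the descriptors it holds.
            for (Selector s : selectors) {
                s.close();
            }
        }
        long after = openFileDescriptors();
        return new long[] {before, during, after};
    }

    public static void main(String[] args) throws IOException {
        long[] s = sampleAroundSelectors(8);
        System.out.printf("before=%d during=%d after=%d%n", s[0], s[1], s[2]);
    }
}
```

Calling openFileDescriptors() once per retry in a test loop and logging the value should make a leak visible as a steadily growing number, while a correct cleanup keeps it roughly flat.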

