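azure - System.Net.Http.HttpRequestException when downloading multiple files from Azure Datalake V2
Problem description
I am downloading a large number of files (>1000) from Azure Datalake V2, and I keep getting this exception: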
The SSL connection could not be established, see inner exception.
<--- Unable to read data from the transport connection: An existing connection was forcibly closed by the remote host..
<--- An existing connection was forcibly closed by the remote host.
Stack trace:
System.Net.Http.HttpRequestException: The SSL connection could not be established, see inner exception.
---> System.IO.IOException: Unable to read data from the transport connection: An existing connection was forcibly closed by the remote host..
---> System.Net.Sockets.SocketException (10054): An existing connection was forcibly closed by the remote host.
--- End of inner exception stack trace ---
at System.Net.Sockets.Socket.AwaitableSocketAsyncEventArgs.ThrowException(SocketError error, CancellationToken cancellationToken)
at System.Net.Sockets.Socket.AwaitableSocketAsyncEventArgs.GetResult(Int16 token)
at System.Net.FixedSizeReader.ReadPacketAsync(Stream transport, AsyncProtocolRequest request)
at System.Net.Security.SslStream.EndProcessAuthentication(IAsyncResult result)
at System.Threading.Tasks.TaskFactory`1.FromAsyncCoreLogic(IAsyncResult iar, Func`2 endFunction, Action`1 endAction, Task`1 promise, Boolean requiresSynchronization)
--- End of stack trace from previous location where exception was thrown ---
at System.Net.Http.ConnectHelper.EstablishSslConnectionAsyncCore(Stream stream, SslClientAuthenticationOptions sslOptions, CancellationToken cancellationToken)
--- End of inner exception stack trace ---
at System.Net.Http.ConnectHelper.EstablishSslConnectionAsyncCore(Stream stream, SslClientAuthenticationOptions sslOptions, CancellationToken cancellationToken)
at System.Net.Http.HttpConnectionPool.ConnectAsync(HttpRequestMessage request, Boolean allowHttp2, CancellationToken cancellationToken)
at System.Net.Http.HttpConnectionPool.CreateHttp11ConnectionAsync(HttpRequestMessage request, CancellationToken cancellationToken)
at System.Net.Http.HttpConnectionPool.GetHttpConnectionAsync(HttpRequestMessage request, CancellationToken cancellationToken)
at System.Net.Http.HttpConnectionPool.SendWithRetryAsync(HttpRequestMessage request, Boolean doRequestAuth, CancellationToken cancellationToken)
at System.Net.Http.RedirectHandler.SendAsync(HttpRequestMessage request, CancellationToken cancellationToken)
at System.Net.Http.DiagnosticsHandler.SendAsync(HttpRequestMessage request, CancellationToken cancellationToken)
Code:
var downloadTasks = job.Files.AsParallel().Select(x => Download(x));
await Task.WhenAll(downloadTasks);

private async Task Download(DownloadableFile file)
{
    try
    {
        var options = new BlobRequestOptions
        {
            ParallelOperationThreadCount = 8,
            DisableContentMD5Validation = true,
            StoreBlobContentMD5 = false
        };
        var xzBlob = await _cloudBlobFileService.GetBlockBlobReference(file.FilePath);
        await xzBlob.DownloadToFileAsync(file.LocalFilePath, FileMode.Create, null, options, null);
    }
    catch (Exception e)
    {
        _log.LogCritical(e, "Error downloading " + file.FilePath);
    }
}
I also added this:
ServicePointManager.DefaultConnectionLimit = Environment.ProcessorCount * 8;
ServicePointManager.Expect100Continue = false;
These two lines are in the Main method of Program.cs in the webjob. I am using .NET Core 3.1 and WindowsAzure.Storage 9.3.3.
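We used to have a blob storage setup without datalake, and the errors started after switching to datalake. They do not affect the application much, because skipped downloads are retried later. Still, it would be good to know what causes them.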
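Solution
You could first try the new storage SDK that became generally available in November, though I cannot guarantee it will fix the problem. It is a complete rewrite.
The exact cause cannot be pinned down from the error message alone, but there are a few things to look at: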
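- Network errors. This is by far the most likely cause, although it is interesting that things were consistent with your old blob storage account. Increasing timeouts may reduce the frequency of network errors, and retry logic will help you recover from them.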
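- Unbounded parallelism is not recommended. ParallelOperationThreadCount applies to uploads, not downloads, so it does not throttle requests in this case. The default connection limit in .NET in a server environment is 10, and raising it is recommended when using .NET Core, so that is worth considering. If you hit the same blob or partition too many times at once, you may start running into the concurrent connection limits of the storage service.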
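
The two suggestions above, capping parallelism and retrying transient failures, could be combined roughly as follows. This is only a minimal sketch: ThrottledDownloader, RunAsync and RetryAsync are hypothetical names that are not part of WindowsAzure.Storage, and the concurrency cap, attempt count and delay are assumptions you would need to tune. It also retries on every exception type, whereas real code would probably filter for transient ones such as HttpRequestException or IOException.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper, not part of any storage SDK.
public static class ThrottledDownloader
{
    // Runs `action` once per item with at most `maxConcurrency` items in flight.
    // Each item gets up to `maxAttempts` tries with exponential backoff between them.
    public static async Task RunAsync<T>(
        IEnumerable<T> items,
        Func<T, Task> action,
        int maxConcurrency,
        int maxAttempts,
        TimeSpan baseDelay)
    {
        using (var gate = new SemaphoreSlim(maxConcurrency))
        {
            var tasks = items.Select(async item =>
            {
                // Wait for a free slot before starting the work for this item.
                await gate.WaitAsync();
                try
                {
                    await RetryAsync(() => action(item), maxAttempts, baseDelay);
                }
                finally
                {
                    gate.Release();
                }
            }).ToList(); // materialize so each task is created exactly once

            await Task.WhenAll(tasks);
        }
    }

    // Retries `operation` until it succeeds or `maxAttempts` is reached.
    // The final failure is rethrown to the caller.
    public static async Task RetryAsync(Func<Task> operation, int maxAttempts, TimeSpan baseDelay)
    {
        for (var attempt = 1; ; attempt++)
        {
            try
            {
                await operation();
                return;
            }
            catch (Exception) when (attempt < maxAttempts)
            {
                // Assumption: every exception is treated as transient.
                // Delay grows as baseDelay, 2x baseDelay, 4x baseDelay, ...
                var factor = Math.Pow(2, attempt - 1);
                await Task.Delay(TimeSpan.FromMilliseconds(baseDelay.TotalMilliseconds * factor));
            }
        }
    }
}
```

With the code from the question, the call might look like `await ThrottledDownloader.RunAsync(job.Files, Download, 16, 3, TimeSpan.FromSeconds(1));`, where 16, 3 and one second are placeholder values. Note that Download as posted catches and logs every exception, so it would have to rethrow (or the try/catch would have to move outside the retry) for the retry to ever trigger.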