首页 > 解决方案 > C# HttpWebRequest.BeginGetResponse 块在一个类中,但不在另一个类中

问题描述

我正在使用 HttpWebRequest.BeginGetResponse 验证代理列表。它工作得非常好,我可以在几秒钟内验证数千个代理并且不会阻塞。

在我的项目中的另一个类中,我正在调用相同的代码并且它会阻塞。

代理验证方法(不阻止):

public void BeginTest(IProxyTest test, Action<ProxyStatus> callback, int timeout = 10000)
{
    var req = HttpWebRequest.Create(test.URL);
    req.Proxy = new WebProxy(this.ToString());
    req.Timeout = timeout;

    WebHelper.BeginGetResponse(req, new Action<RequestCallbackState>(callbackState =>
    {
        if (callbackState.Exception != null)
        {
            callback(ProxyStatus.Invalid);
        }
        else
        {
            var responseStream = callbackState.ResponseStream;
            using (var reader = new StreamReader(responseStream))
            {
                var responseString = reader.ReadToEnd();
                if (responseString.Contains(test.Validation))
                {
                    callback(ProxyStatus.Valid);
                }
                else
                {
                    callback(ProxyStatus.Invalid);
                }
            }
        }
    }));
}

WebHelper.BeginGetResponse

public static void BeginGetResponse(WebRequest request, Action<RequestCallbackState> responseCallback)
{
    Task<WebResponse> asyncTask = Task.Factory.FromAsync<WebResponse>(request.BeginGetResponse, request.EndGetResponse, null);
    ThreadPool.RegisterWaitForSingleObject((asyncTask as IAsyncResult).AsyncWaitHandle, new WaitOrTimerCallback(TimeoutCallback), request, request.Timeout, true);
    asyncTask.ContinueWith(task =>
    {
        WebResponse response = task.Result;
        Stream responseStream = response.GetResponseStream();
        responseCallback(new RequestCallbackState(responseStream));
        responseStream.Close();
        response.Close();
    }, TaskContinuationOptions.NotOnFaulted);
    //Handle errors
    asyncTask.ContinueWith(task =>
    {
        var exception = task.Exception;
        responseCallback(new RequestCallbackState(exception.InnerException));
    }, TaskContinuationOptions.OnlyOnFaulted);
}

具有类似方法的其他类也调用 WebHelper.BeginGetResponse,但会阻塞(为什么?)

public void BeginTest(Action<ProxyStatus> callback, int timeout = 10000)
{
    var req = HttpWebRequest.Create(URL);
    req.Timeout = timeout;

    WebHelper.BeginGetResponse(req, new Action<RequestCallbackState>(callbackState =>
    {
        if (callbackState.Exception != null)
        {
            callback(ProxyStatus.Invalid);
        }
        else
        {
            var responseStream = callbackState.ResponseStream;
            using (var reader = new StreamReader(responseStream))
            {
                var responseString = reader.ReadToEnd();
                if (responseString.Contains(Validation))
                {
                    callback(ProxyStatus.Valid);
                }
                else
                {
                    callback(ProxyStatus.Invalid);
                }
            }
        }
    }));
}

调用阻塞的代码

private async void validateTestsButton_Click(object sender, EventArgs e)
{
    await Task.Run(() =>
    {
        foreach (var test in tests)
        {
            test.BeginTest((status) => test.Status = status);
        }
    });
}

调用不阻塞的代码:

public static async Task BeginTests(ICollection<Proxy> proxies, ICollection<ProxyJudge> judges, int timeout = 10000, IProgress<int> progress = null)
{
    await Task.Run(() =>
    {
        foreach (var proxy in proxies)
        {
            proxy.BeginTest(judges.GetRandomItem(), new Action<ProxyStatus>(status =>
            {
                proxy.Status = status;
            }), timeout);
        }
    });
}

标签: c#

解决方案


尽管这并不能完全解决您的问题,但它可能会对您有所帮助

这里有几个问题

  1. 您正在使用 APM(异步编程模型
  2. 你正在使用ThreadPool看起来有点过时的类
  3. 您正在执行IO 绑定工作并阻塞线程池上的线程
  4. 您正在使用 APM 和 TBA 异步模型的奇怪组合
  5. 并且似乎占用了等待 IO 的线程池

所以你在做IO 绑定的工作,你可能猜到的最好的模式是 TBAasync await模式。基本上,每次等待IO 完成端口时,您都希望将该线程返回给操作系统,并对您的系统友好,从而为需要的地方释放资源。

此外,您显然需要某种程度的并行性,并且您最好至少对其进行一些控制。

我建议这对于TPL 数据流ActionBlock来说是一项不错的工作

给定

public class Proxy
{
   public ProxyStatus ProxyStatus { get; set; }
   public string ProxyUrl { get; set; }
   public string WebbUrl { get; set; }
   public string Error { get; set; }
}

动作块示例

public static async Task DoWorkLoads(List<Proxy> results)
{
   var options = new ExecutionDataflowBlockOptions
                     {
                        MaxDegreeOfParallelism = 50
                     };

   var block = new ActionBlock<Proxy>(CheckUrlAsync, options);

   foreach (var proxy in results)
   {
      block.Post(proxy);
   }

   block.Complete();
   await block.Completion;
}

CheckUrlAsync 示例

// note i havent tested this, add pepper and salt to taste
public static async Task CheckUrlAsync(Proxy proxy)
{
   try
   {

      var request = WebRequest.Create(proxy.Url);

      if (proxy.ProxyUrl != null)
         request.Proxy = new WebProxy(proxy.ProxyUrl);

      using (var response = await request.GetResponseAsync())
      {
         using (var responseStream = response.GetResponseStream())
         {
            using (var reader = new StreamReader(responseStream))
            {
               var responseString = reader.ReadToEnd();
               if (responseString.Contains("asdasd"))
                  proxy.ProxyStatus = ProxyStatus.Valid;
               else
                  proxy.ProxyStatus = ProxyStatus.Invalid;
            }
         }
      }

   }
   catch (Exception e)
   {
      proxy.ProxyStatus = ProxyStatus.Error;
      proxy.Error = e.Message;
   }
}

用法

 await DoWorkLoads(proxies to test);

概括

代码更整洁,你不会到处乱扔动作,你使用async并且await你已经放弃了 APM,你可以控制并行度并且你对线程池很友好


推荐阅读