ScrapySharp form submit results in System.AggregateException

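Problem description

I have spent hours trying to figure out why this isn't working.

I am trying to use ScrapySharp to scrape a website. For now I am only trying it against a sample site before moving on to my actual site.

Every time my program calls form.Submit(), I get a System.AggregateException (Specified cast is not valid).

My code: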

using System;
using System.IO;
using System.Linq;
using System.Net;
using HtmlAgilityPack;
using ScrapySharp.Extensions;
using ScrapySharp.Html;
using ScrapySharp.Html.Forms;
using ScrapySharp.Network;

namespace WebScraper
{
    class MainClass
    {
        public static void Main(string[] args)
        {
            ScrapingBrowser browser = new ScrapingBrowser();

            //set UseDefaultCookiesParser as false if a website returns invalid cookies format
            //browser.UseDefaultCookiesParser = false;
            browser.AllowAutoRedirect = true;
            browser.AllowMetaRedirect = true;
            WebPage homePage = browser.NavigateToPage(new Uri("http://the-internet.herokuapp.com/login"));

            PageWebForm form = homePage.FindForm("login");
            form["username"] = "tomsmith";
            form["password"] = "SuperSecretPassword!";
            form.Method = HttpVerb.Get; //I tried both .Post and .Get
            WebPage resultsPage = form.Submit(); //THIS IS WHERE I GET THE ERROR
            Console.WriteLine(resultsPage);

        }
    }
}

My error:

System.AggregateException: One or more errors occurred. (Specified cast is not valid.) ---> System.InvalidCastException: Specified cast is not valid.
  at ScrapySharp.Network.ScrapingBrowser.CreateRequest (System.Uri url, ScrapySharp.Network.HttpVerb verb) [0x0000b] in <0a639adc663f45108f057c429262c620>:0
  at ScrapySharp.Network.ScrapingBrowser.NavigateToPageAsync (System.Uri url, ScrapySharp.Network.HttpVerb verb, System.String data, System.String contentType) [0x00066] in <0a639adc663f45108f057c429262c620>:0
   --- End of inner exception stack trace ---
  at System.Threading.Tasks.Task.ThrowIfExceptional (System.Boolean includeTaskCanceledExceptions) [0x00011] in /Users/builder/jenkins/workspace/build-package-osx-mono/2019-06/external/bockbuild/builds/mono-x64/external/corert/src/System.Private [...] 1[TResult].GetResultCore (System.Boolean waitCompletionNotification) [0x0002b] in /Users/builder/jenkins/workspace/build-package-osx-mono/2019-06/external/bockbuild/builds/mono-x64/external/corert/src/System.Private.CoreLib/src/System/Threading/Tasks/Future.cs:496
  at System.Threading.Tasks.Task`1[TResult].get_Result () [0x00000] in /Users/builder/jenkins/workspace/build-package-osx-mono/2019-06/external/bockbuild/builds/mono-x64/external/corert/src/System.Private.CoreLib/src/System/Threading/Tasks/Future.cs:466
  at ScrapySharp.Network.ScrapingBrowser.NavigateToPage (System.Uri url, ScrapySharp.Network.HttpVerb verb, System.String data, System.String contentType) [0x0000b] in <0a639adc663f45108f057c429262c620>:0
  at ScrapySharp.Html.Forms.PageWebForm.Submit () [0x00023] in <0a639adc663f45108f057c429262c620>:0
  at WebScraper.MainClass/x [0006] inargsUserClass.Main (0006[] arib/Projects/WebScraper/WebScraper/Program.cs:29

I am tired of this error, and any and all help is greatly appreciated. Thank you.

Tags: c#, .net, web-scraping, aggregateexception, scrapysharp

Solution

The problem is that form["username"] gives you a string. What you want is the FormField itself, which you can get with the following code:

WebPage homePage = browser.NavigateToPage(new Uri("http://the-internet.herokuapp.com/login"));
PageWebForm form = homePage.FindForm("login");
var formFields = form.FormFields;
foreach (var field in formFields)
{
    if (field.Name.Equals("username", StringComparison.OrdinalIgnoreCase))
    {
        field.Value = "tomsmith";

    }
    else if (field.Name.Equals("password", StringComparison.OrdinalIgnoreCase))
    {
        field.Value = "SuperSecretPassword!";

    }
}

WebPage resultsPage = form.Submit();
Console.WriteLine(resultsPage);

Alternatively, you can use Find() to get the FormField:

var usernameField = form.FormFields.Find(x => x.Name == "username");
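
If an error still shows up, it may help to look past the AggregateException. In the trace from the question it is only a wrapper: the real failure is the System.InvalidCastException raised inside ScrapingBrowser.CreateRequest, and it surfaces as an AggregateException because the result is read through Task.Result. The sketch below shows one way to unwrap such an exception. It does not use ScrapySharp at all: the failing call is simulated with Task.Run, and DescribeFailure is a hypothetical helper name chosen for illustration.

```csharp
using System;
using System.Threading.Tasks;

static class AggregateExceptionDemo
{
    // Hypothetical helper: runs a blocking call and, if it fails with an
    // AggregateException, reports the first underlying exception instead
    // of the wrapper.
    public static string DescribeFailure(Action blockingCall)
    {
        try
        {
            blockingCall();
            return "no error";
        }
        catch (AggregateException ex)
        {
            // Flatten() collapses nested AggregateExceptions into one level,
            // so InnerExceptions holds the actual causes.
            Exception inner = ex.Flatten().InnerExceptions[0];
            return inner.GetType().FullName + ": " + inner.Message;
        }
    }

    public static void Main()
    {
        // Stand-in for form.Submit(): a task that fails and is read via .Result,
        // which is what wraps the failure in an AggregateException.
        Func<int> failing = () =>
        {
            throw new InvalidCastException("Specified cast is not valid.");
        };

        string report = DescribeFailure(() =>
        {
            int ignored = Task.Run(failing).Result;
        });

        Console.WriteLine(report);
        // prints "System.InvalidCastException: Specified cast is not valid."
    }
}
```

In the scraper itself, the same try/catch could be placed around form.Submit(), so that the log shows the inner exception type and message rather than only "One or more errors occurred."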
