c# - Using HtmlAgilityPack, nested lists, and LINQ
Question
List<List<string>> table = playerDoc.DocumentNode
.SelectSingleNode($"//*[@id='lg_team_user_leagues-{leagueId}']/div[4]/table/tbody")
.Descendants("tr")
.Skip(1)
.Select(tr => tr.Elements("td").Select(td => td.InnerText.Trim()).ToList())
.ToList();
I have this code block, which collects all the correct information from a table on a website. My problem is that the data looks like this:
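For example, I'm trying to figure out how to search the data for two matching strings, S16 and Pre, so that I can populate a class named CareerProperties (I can post the class with its properties if needed). I've tried different variations of LINQ statements and foreach loops, but I either get an exception or get everything in the table.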
I'm trying to simplify my code because retrieving the data using foreach with xpaths takes about 3-4 seconds, whereas when I tested the LINQ statement it came back as Elapsed: 00:00:00.0068306.
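Any help would be greatly appreciated, as I'm still learning C#. If I need to post the example web page or any other part of the code, I will do so. Thank you.
Edit: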
foreach (var careerStats in findCareerNode)
{
if (careerStats
.SelectSingleNode($"//*[@id='lg_team_user_leagues-{leagueId}']/div[{div}]/table/tbody/tr[{index}]/td[1]").InnerText.Trim() != seasonId)
{
index++;
continue;
}
else if (careerStats
.SelectSingleNode(
$"//*[@id='lg_team_user_leagues-{leagueId}']/div[{div}]/table/tbody/tr[{index}]/td[2]")
.InnerText.Trim() != "Reg")
{
index++;
continue;
}
var type = careerStats
.SelectSingleNode(
$"//*[@id='lg_team_user_leagues-{leagueId}']/div[{div}]/table/tbody/tr[{index}]/td[2]")
.InnerText;
var record = careerStats
.SelectSingleNode(
$"//*[@id='lg_team_user_leagues-{leagueId}']/div[{div}]/table/tbody/tr[{index}]/td[3]")
.InnerText;
var amr = careerStats
.SelectSingleNode(
$"//*[@id='lg_team_user_leagues-{leagueId}']/div[{div}]/table/tbody/tr[{index}]/td[4]")
.InnerText ?? "0.0";
var goals = careerStats
.SelectSingleNode(
$"//*[@id='lg_team_user_leagues-{leagueId}']/div[{div}]/table/tbody/tr[{index}]/td[5]")
.InnerText;
var assists = careerStats
.SelectSingleNode(
$"//*[@id='lg_team_user_leagues-{leagueId}']/div[{div}]/table/tbody/tr[{index}]/td[6]")
.InnerText;
var sot = careerStats
.SelectSingleNode(
$"//*[@id='lg_team_user_leagues-{leagueId}']/div[{div}]/table/tbody/tr[{index}]/td[7]")
.InnerText;
var shots = careerStats
.SelectSingleNode(
$"//*[@id='lg_team_user_leagues-{leagueId}']/div[{div}]/table/tbody/tr[{index}]/td[8]")
.InnerText;
var passC = careerStats
.SelectSingleNode(
$"//*[@id='lg_team_user_leagues-{leagueId}']/div[{div}]/table/tbody/tr[{index}]/td[9]")
.InnerText;
var passA = careerStats
.SelectSingleNode(
$"//*[@id='lg_team_user_leagues-{leagueId}']/div[{div}]/table/tbody/tr[{index}]/td[10]")
.InnerText;
var keypass = careerStats
.SelectSingleNode(
$"//*[@id='lg_team_user_leagues-{leagueId}']/div[{div}]/table/tbody/tr[{index}]/td[11]")
.InnerText;
var interceptions = careerStats
.SelectSingleNode(
$"//*[@id='lg_team_user_leagues-{leagueId}']/div[{div}]/table/tbody/tr[{index}]/td[12]")
.InnerText;
var tac = careerStats
.SelectSingleNode(
$"//*[@id='lg_team_user_leagues-{leagueId}']/div[{div}]/table/tbody/tr[{index}]/td[13]")
.InnerText;
var tacA = careerStats
.SelectSingleNode(
$"//*[@id='lg_team_user_leagues-{leagueId}']/div[{div}]/table/tbody/tr[{index}]/td[14]")
.InnerText;
var blk = careerStats
.SelectSingleNode(
$"//*[@id='lg_team_user_leagues-{leagueId}']/div[{div}]/table/tbody/tr[{index}]/td[15]")
.InnerText;
var rc = careerStats
.SelectSingleNode(
$"//*[@id='lg_team_user_leagues-{leagueId}']/div[{div}]/table/tbody/tr[{index}]/td[16]")
.InnerText;
var yc = careerStats
.SelectSingleNode(
$"//*[@id='lg_team_user_leagues-{leagueId}']/div[{div}]/table/tbody/tr[{index}]/td[17]")
.InnerText;
...
}
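Answer
To filter the rows of the career statistics table, you can use the LINQ method Where. The filtered rows can then be turned into a list of CareerProperties objects with the LINQ method Select.
Here is how to get the career stats for the selected seasonId and Reg: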
// Now the return type is a List of CareerProperties.
List<CareerProperties> table = playerDoc.DocumentNode
.SelectSingleNode($"//*[@id='lg_team_user_leagues-{leagueId}']/div[4]/table/tbody")
.Descendants("tr")
.Skip(1)
// Up to here is your code. Here you select all rows from the table.
// Each row is presented as List<string>.
.Select(tr => tr.Elements("td").Select(td => td.InnerText.Trim()).ToList())
// Here we filter table rows by "seasonId" and "Reg".
.Where(tr => tr[0] == seasonId && tr[1] == "Reg")
// Here we create objects CareerProperties from filtered rows.
.Select(tr => new CareerProperties
{
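// List indexes are 0-based while XPath td[n] is 1-based,
// so td[2] (the type column) is tr[1], td[3] is tr[2], and so on.
Type = tr[1],
Record = tr[2],
Amr = tr[3],
Goals = tr[4],
Assists = tr[5],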
// Fill other properties.
...
})
.ToList();
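The question mentions exceptions when filtering. One likely cause with index-based access is a row that has fewer cells than expected (a totals or spacer row, for example), since tr[0] on a short list throws ArgumentOutOfRangeException. Below is a minimal sketch of the same Where/Select pipeline with a length guard added. It works on plain List<List<string>> data so it runs without HtmlAgilityPack. The CareerProperties shape, the CareerRowFilter helper, the MinCells constant, and the sample rows are all assumptions made for illustration, since the real class and data were not posted.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical shape: the real CareerProperties class was not posted.
public class CareerProperties
{
    public string Type { get; set; } = "";
    public string Record { get; set; } = "";
    public string Amr { get; set; } = "";
    public string Goals { get; set; } = "";
    public string Assists { get; set; } = "";
}

// Hypothetical helper, not part of HtmlAgilityPack or of the original answer.
public static class CareerRowFilter
{
    // Assumed minimum number of cells needed for the properties mapped below.
    private const int MinCells = 6;

    public static List<CareerProperties> Filter(
        IEnumerable<List<string>> rows, string seasonId, string type)
    {
        return rows
            // Skip rows that are too short to index safely.
            .Where(r => r.Count >= MinCells)
            // Column 0 is the season, column 1 is the type (Pre, Reg, ...).
            .Where(r => r[0] == seasonId && r[1] == type)
            .Select(r => new CareerProperties
            {
                Type = r[1],
                Record = r[2],
                Amr = r[3],
                Goals = r[4],
                Assists = r[5],
            })
            .ToList();
    }
}

public static class Demo
{
    public static void Main()
    {
        // Made-up sample rows standing in for the scraped table.
        var rows = new List<List<string>>
        {
            new List<string> { "S16", "Pre", "3-1-0", "7.2", "2", "1" },
            new List<string> { "S16", "Reg", "10-4-2", "7.0", "5", "3" },
            new List<string> { "S15", "Pre", "1-1-1", "6.5", "0", "0" },
            new List<string> { "Totals" }, // short row, ignored by the guard
        };

        var result = CareerRowFilter.Filter(rows, "S16", "Pre");
        Console.WriteLine($"{result[0].Type} {result[0].Record}"); // prints: Pre 3-1-0
    }
}
```

With the scraped table, the same guard could sit between the Select that builds each List<string> and the Where that compares the season and type columns.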