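java - HtmlUnit gets the table in a loop, but not on the second pass
Problem description
I am parsing a web page with HtmlUnit. The page has a number of inputs that I set programmatically before clicking the submit button. The analysis results are then returned on the same page, below the inputs.
The parser works on the first pass through the loop, but not on the second. Here is the code: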
public void getPortfolioVisualizerData(List<String> symbols) throws Exception {
    final WebClient webClient = new WebClient();
    final HtmlPage page = webClient.getPage("https://www.portfoliovisualizer.com/backtest-portfolio#analysisResults");
    HtmlForm form = page.getFirstByXPath("//form[@action='backtest-portfolio#analysisResults']");
    //Time Period combobox
    HtmlSelect select = (HtmlSelect) page.getElementById("timePeriod");
    HtmlOption option = select.getOptionByValue("4");
    select.setSelectedAttribute(option, true);
    //Start Year combobox
    select = (HtmlSelect) page.getElementById("startYear");
    option = select.getOptionByValue("1985");
    select.setSelectedAttribute(option, true);
    //End Year combobox
    select = (HtmlSelect) page.getElementById("endYear");
    option = select.getOptionByValue("2018");
    select.setSelectedAttribute(option, true);
    //Initial Amount text input
    HtmlTextInput textField = form.getInputByName("initialAmount");
    textField.type("10000");
    //Periodic Adjustment combobox
    select = (HtmlSelect) page.getElementById("annualOperation");
    option = select.getOptionByValue("0");
    select.setSelectedAttribute(option, true);
    //Rebalancing combobox
    select = (HtmlSelect) page.getElementById("rebalanceType");
    option = select.getOptionByValue("1");
    select.setSelectedAttribute(option, true);
    //Display Income combobox
    select = (HtmlSelect) page.getElementById("showYield");
    option = select.getOptionByValue("false");
    select.setSelectedAttribute(option, true);
    //Benchmark combobox
    select = (HtmlSelect) page.getElementById("benchmark");
    option = select.getOptionByValue("VFINX");
    select.setSelectedAttribute(option, true);
    //Allocation 1 text input
    textField = form.getInputByName("allocation1_1");
    textField.type("100");
    HtmlSubmitInput button = (HtmlSubmitInput) page.getElementById("submitButton");
    Data data = new Data();
    for (String symbol : symbols) {
        //Asset 1 text input
        textField = form.getInputByName("symbol1");
        textField.type(symbol);
        // Now submit the form by clicking the Analyze Portfolios button and get back the second page.
        HtmlPage page2 = button.click();
        HtmlTable table = (HtmlTable) page2.getByXPath("//table[@class='table table-striped table-condensed']").get(1); //the second table on the page
        int rowNum = 0;
        for (HtmlTableRow row : table.getRows()) {
            rowNum++;
            if (rowNum == 1) continue; //skip table header values
            int colNum = 0;
            for (HtmlTableCell cell : row.getCells()) {
                colNum++;
                if (rowNum == 2) {
                    data.Symbol = symbol;
                    String val = cell.asText();
                    switch (colNum) {
                        case 4: data.CAGR = val.replace("%", ""); break;
                        case 5: data.StdDev = val.replace("%", ""); break;
                        case 6: data.BestYear = val.replace("%", ""); break;
                        case 7: data.WorstYear = val.replace("%", ""); break;
                        case 8: data.MaxDrawdown = val.replace("%", ""); break;
                        case 9: data.SharpRatio = val; break;
                        case 10: data.SortinoRatio = val; break;
                        case 11: data.CorrelationToUsMkt = val;
                    }
                }
            }
        }
        saveStock(data);
        button = (HtmlSubmitInput) page2.getElementById("submitButton");
        form = page2.getFirstByXPath("//form[@action='backtest-portfolio#analysisResults']");
    }
}
It throws a java.lang.IndexOutOfBoundsException: Index: 1, Size: 0 on this line:
HtmlTable table = (HtmlTable) page2.getByXPath("//table[@class='table table-striped table-condensed']").get(1); //the second table on the page
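The table I am interested in is the second table on the page, but the error seems to indicate that no tables at all are found on the second pass through the loop. Why not? If I enter the second symbol manually, the page returns the table I am interested in.
Solution
I think you should add a delay after the click and before fetching the table via XPath. The code may be querying for the table before the second page has finished loading.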
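One way to add such a delay without hard-coding a fixed sleep is to poll until the expected tables show up or a timeout expires. The sketch below is only an illustration of that idea: the class `PollingWait` and the method `waitForAtLeast` are hypothetical names, not part of HtmlUnit, and the `main` method uses a plain list as a stand-in for the page so that it runs without any library. If the results are filled in by JavaScript, HtmlUnit's own `webClient.waitForBackgroundJavaScript(timeoutMillis)` may also be worth trying right after `button.click()`; whether that applies to this particular site is an assumption.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

final class PollingWait {

    /**
     * Re-runs the query until it returns at least minSize items or the
     * timeout elapses. The last result is returned either way, so the
     * caller must still check its size before indexing into it.
     */
    static <T> List<T> waitForAtLeast(Supplier<List<T>> query, int minSize,
                                      long timeoutMillis, long pollMillis) {
        long deadline = System.currentTimeMillis() + timeoutMillis;
        List<T> result = query.get();
        while (result.size() < minSize && System.currentTimeMillis() < deadline) {
            try {
                Thread.sleep(pollMillis);
            } catch (InterruptedException e) {
                // Preserve the interrupt flag and stop waiting.
                Thread.currentThread().interrupt();
                break;
            }
            result = query.get();
        }
        return result;
    }

    public static void main(String[] args) {
        // Stand-in for a page whose tables only appear on the third query.
        List<String> tables = new ArrayList<>();
        int[] calls = {0};
        Supplier<List<String>> query = () -> {
            if (++calls[0] == 3) {
                tables.add("summary");
                tables.add("performance");
            }
            return tables;
        };

        List<String> found = waitForAtLeast(query, 2, 2000, 10);
        System.out.println(found.size() >= 2 ? found.get(1) : "tables never appeared");
    }
}
```

In the loop from the question, the supplier would wrap the `page2.getByXPath(...)` call, and `.get(1)` would only be called once the returned list holds at least two tables. If the list is still empty after the timeout, a delay alone is not the fix and the returned page itself is worth inspecting.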