cookies - How to use a Jsoup Document method
Question
I am new to Java and still a beginner. I have this code:
import java.io.IOException;
import java.util.Map;

import org.jsoup.Connection;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

public class Test2 {
    public static void main(String[] args) throws IOException {
        try {
            String url = "http://www.metalbulletin.com/Login.html?ReturnURL=%2fdefault.aspx&";
            String articleURL = "https://www.metalbulletin.com/Article/3838710/Home/CHINA-REBAR-Domestic-prices-recover-after-trading-pick-up.html";

            // Load the login page to obtain its cookies and the hidden XSRF token.
            Connection.Response loginForm = Jsoup.connect(url)
                    .method(Connection.Method.GET)
                    .execute();
            Document welcomePage = loginForm.parse();
            Element formElement = welcomePage.body().getElementsByTag("form").get(0);
            String formAction = formElement.attr("action");
            Elements input = welcomePage.select("input[name=idsrv.xsrf]");
            String securityTokenValue = input.attr("value");

            // Submit the login form, sending back the cookies from the first request.
            Connection.Response mainPage = Jsoup.connect("https://account.metalbulletin.com" + formAction)
                    .data("idsrv.xsrf", securityTokenValue)
                    .data("username", "ifiih@rupayamail.com")
                    .data("password", "Kh457544")
                    .cookies(loginForm.cookies())
                    .method(Connection.Method.POST)
                    .execute();
            Map<String, String> cookies = mainPage.cookies();
            System.out.println("\n\nloginForm.cookies()==>\n" + loginForm.cookies());
            System.out.println("\n\nmainPage.cookies()==>\n" + mainPage.cookies());

            // Fetch the article with the cookies returned by the login request.
            Document articlePage = Jsoup.connect(articleURL).cookies(cookies).get();
            Element article = articlePage.getElementById("article-body");
            Elements lead1 = article.getElementsByClass("articleContainer");
            System.out.println("\n\nNews Article==>\n" + lead1);
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}
How can I refactor it to use this:
private Map<String, String> cookies = new HashMap<String, String>();

private Document get(String url) throws IOException {
    Connection connection = Jsoup.connect(url);
    // Send every cookie collected so far.
    for (Map.Entry<String, String> cookie : cookies.entrySet()) {
        connection.cookie(cookie.getKey(), cookie.getValue());
    }
    Connection.Response response = connection.execute();
    // Remember the cookies from this response for later requests.
    cookies.putAll(response.cookies());
    return response.parse();
}
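I am not sure how to call this private Document get(String url) method. It may seem like a silly question, but it is very important to me.
How can I call it from within the same class?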
Answer
To retrieve both the Document and the cookie map, one approach is to create a new class named TestThreadHandler, as follows:
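public class TestThreadHandler implements Runnable {
    private final String url;
    private final Map<String, String> cookies;
    private final Semaphore barrier;
    private Document doc; // set by run(), read back through getDoc()

    public TestThreadHandler(String url, Map<String, String> cookies, Semaphore barrier) {
        this.url = url;
        this.cookies = cookies;
        this.barrier = barrier;
    }

    @Override
    public void run() {
        try {
            Connection connection = Jsoup.connect(this.url);
            for (Map.Entry<String, String> cookie : this.cookies.entrySet()) {
                connection.cookie(cookie.getKey(), cookie.getValue());
            }
            Connection.Response response = connection.execute();
            this.cookies.putAll(response.cookies());
            this.doc = response.parse();
        } catch (IOException e) {
            e.printStackTrace();
        } finally {
            // Always release, so the waiting caller is never blocked forever.
            this.barrier.release();
        }
    }

    // Returns the parsed document, or null if the request failed.
    public Document getDoc() {
        return this.doc;
    }
}
Then start that thread from your Test2 class wherever you need it. An example call looks like this:
public class Test2 {
    public static void main(String[] args) throws IOException {
        try {
            ...
            String url = "https://www.google.com";
            Map<String, String> cookies = new HashMap<String, String>();
            Semaphore barrier = new Semaphore(0);
            TestThreadHandler handler = new TestThreadHandler(url, cookies, barrier);
            Thread taskThread = new Thread(handler);
            taskThread.start();
            barrier.acquireUninterruptibly(1); // Wait until the thread finishes
            Document doc = handler.getDoc(); // null if the request failed
            // Now cookies is filled and doc holds the page parsed in TestThreadHandler
            ...
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}
This way the thread fills the cookie map you passed in, and you read the Jsoup document back through getDoc(). The document cannot be returned by assigning to a constructor parameter: Java passes references by value, so reassigning a parameter inside the handler would not change the caller's variable, and an uninitialized local Document could not be passed in at all.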
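The hand-off between the worker thread and the caller can be tried without any network access. The sketch below is a hypothetical stand-in: HandOffTask, HandOffDemo, the cookie value, and the placeholder string are invented for illustration and are not part of Jsoup. It only aims to show that a map passed to the thread is filled in place, while a value the thread creates has to be kept in a field and read back afterwards.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.Semaphore;

// Hypothetical stand-in for a handler class: no network access, it only
// shows how results travel back from the worker thread to the caller.
class HandOffTask implements Runnable {
    private final Map<String, String> cookies; // same map object the caller holds
    private final Semaphore barrier;
    private String result;                     // read back through getResult()

    HandOffTask(Map<String, String> cookies, Semaphore barrier) {
        this.cookies = cookies;
        this.barrier = barrier;
    }

    @Override
    public void run() {
        try {
            // Mutating the shared map is visible to the caller.
            cookies.put("session", "abc123");
            // A newly created value must be stored in a field to outlive run().
            result = "parsed document placeholder";
        } finally {
            // Release even if the work above throws, so the caller never hangs.
            barrier.release();
        }
    }

    String getResult() {
        return result;
    }
}

class HandOffDemo {
    public static void main(String[] args) {
        Map<String, String> cookies = new HashMap<>();
        Semaphore barrier = new Semaphore(0);
        HandOffTask task = new HandOffTask(cookies, barrier);

        new Thread(task).start();
        barrier.acquireUninterruptibly(); // blocks until run() calls release()

        System.out.println(cookies);
        System.out.println(task.getResult());
    }
}
```

The semaphore starts with zero permits, so the caller waits until the worker releases one. That release and acquire pair also makes the worker's writes visible to the caller.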
For further explanation, see the Java documentation on thread handling, or feel free to ask me!
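If a separate thread is not needed, a private method can also be called directly from other methods of the same class. The following is only a rough sketch under stated assumptions: SessionDemo, open, cookieCount, and fetch are hypothetical names, and fetch is a made-up replacement for the Jsoup request so that the example runs offline. Because main is static, it creates an instance first and then calls the private get method on it.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: fetch() stands in for the real HTTP request so the
// example runs without a network connection.
class SessionDemo {
    // Instance state shared by every call to get(), like the cookie map in the question.
    private final Map<String, String> cookies = new HashMap<>();

    // Private instance method: callable from any other method of this class.
    private String get(String url) {
        Map<String, String> responseCookies = fetch(url, cookies);
        cookies.putAll(responseCookies); // remember cookies for the next request
        return "page for " + url;
    }

    // Made-up replacement for executing a request: returns one new cookie per call.
    private Map<String, String> fetch(String url, Map<String, String> sent) {
        Map<String, String> received = new HashMap<>();
        received.put("visit" + (sent.size() + 1), url);
        return received;
    }

    // Another instance method can call get() without any qualifier.
    String open(String url) {
        return get(url);
    }

    int cookieCount() {
        return cookies.size();
    }

    public static void main(String[] args) {
        // main is static, so it needs an instance before it can call get().
        SessionDemo session = new SessionDemo();
        System.out.println(session.get("https://example.com/login"));
        System.out.println(session.get("https://example.com/article"));
        System.out.println(session.cookieCount());
    }
}
```

Keeping the cookie map as an instance field means each SessionDemo object carries its own session, which is the design the get method in the question already implies.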
Hope this helps!