Jsoup connection cannot be established when the URL string is passed in a certain way

Problem description

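I have a very strange problem.

I'm running a Spring application in which I basically just spawn some threads and, inside each thread, try to connect to a website to read the status code of the response. Nothing special, but I've run into a problem that really confuses me.

I have the following code: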

    @Override
    public void run() {

        Document document;
        Connection.Response response;
        String link = "https://lu.vpbank.com/htm/752/de_LU/Stellenangebote.htm";
        System.out.println(link);
        System.out.println(this.site.getLink());

        //Is working fine
        try {
            response = Jsoup.connect(link).followRedirects(false).ignoreHttpErrors(true).execute();
            System.out.println(response.statusCode());
        } catch (IOException e) {
            e.printStackTrace();
        }

        //Is not working
        try {
            response = Jsoup.connect(this.site.getLink()).followRedirects(false).ignoreHttpErrors(true).execute();
            System.out.println(response.statusCode());
        } catch (IOException e) {
            e.printStackTrace();
        }
    }

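The first attempt to create a connection works fine; there, the content of the string is declared inside the method.

In the second attempt, I take the URL string from a previously created object whose URL comes from the database. This throws an error...

The console output is: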

https://lu.vpbank.com/htm/752/de_LU/Stellenangebote.htm
404
https://www.vpbank.lu/htm/752/de_LU/Stellenangebote.htm
javax.net.ssl.SSLException: Connection reset
    at java.base/sun.security.ssl.Alert.createSSLException(Alert.java:127)
    at java.base/sun.security.ssl.TransportContext.fatal(TransportContext.java:369)
    at java.base/sun.security.ssl.TransportContext.fatal(TransportContext.java:312)
    at java.base/sun.security.ssl.TransportContext.fatal(TransportContext.java:307)
    at java.base/sun.security.ssl.SSLSocketImpl.handleException(SSLSocketImpl.java:1680)
    at java.base/sun.security.ssl.SSLSocketImpl$AppInputStream.read(SSLSocketImpl.java:1054)
    at java.base/java.io.BufferedInputStream.fill(BufferedInputStream.java:244)
    at java.base/java.io.BufferedInputStream.read1(BufferedInputStream.java:284)
    at java.base/java.io.BufferedInputStream.read(BufferedInputStream.java:343)
    at java.base/sun.net.www.http.HttpClient.parseHTTPHeader(HttpClient.java:754)
    at java.base/sun.net.www.http.HttpClient.parseHTTP(HttpClient.java:689)
    at java.base/sun.net.www.http.HttpClient.parseHTTP(HttpClient.java:713)
    at java.base/sun.net.www.protocol.http.HttpURLConnection.getInputStream0(HttpURLConnection.java:1623)
    at java.base/sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1528)
    at java.base/java.net.HttpURLConnection.getResponseCode(HttpURLConnection.java:527)
    at java.base/sun.net.www.protocol.https.HttpsURLConnectionImpl.getResponseCode(HttpsURLConnectionImpl.java:308)
    at org.jsoup.helper.HttpConnection$Response.execute(HttpConnection.java:736)
    at org.jsoup.helper.HttpConnection$Response.execute(HttpConnection.java:707)
    at org.jsoup.helper.HttpConnection.execute(HttpConnection.java:297)
    at net.candidatis.tierone.crawls.careersite.CrawlableCareerBasic.run(CrawlableCareerBasic.java:48)
    at net.candidatis.tierone.controllers.TestController.testCrawl(TestController.java:32)
    at net.candidatis.tierone.TieroneApplication.run(TieroneApplication.java:36)
    at org.springframework.boot.SpringApplication.callRunner(SpringApplication.java:804)
    at org.springframework.boot.SpringApplication.callRunners(SpringApplication.java:788)
    at org.springframework.boot.SpringApplication.run(SpringApplication.java:333)
    at org.springframework.boot.SpringApplication.run(SpringApplication.java:1309)
    at org.springframework.boot.SpringApplication.run(SpringApplication.java:1298)
    at net.candidatis.tierone.TieroneApplication.main(TieroneApplication.java:27)
    Suppressed: java.net.SocketException: Broken pipe
        at java.base/sun.nio.ch.NioSocketImpl.implWrite(NioSocketImpl.java:420)
        at java.base/sun.nio.ch.NioSocketImpl.write(NioSocketImpl.java:440)
        at java.base/sun.nio.ch.NioSocketImpl$2.write(NioSocketImpl.java:826)
        at java.base/java.net.Socket$SocketOutputStream.write(Socket.java:1051)
        at java.base/sun.security.ssl.SSLSocketOutputRecord.encodeAlert(SSLSocketOutputRecord.java:82)
        at java.base/sun.security.ssl.TransportContext.fatal(TransportContext.java:400)
        ... 26 more
Caused by: java.net.SocketException: Connection reset
    at java.base/sun.nio.ch.NioSocketImpl.implRead(NioSocketImpl.java:323)
    at java.base/sun.nio.ch.NioSocketImpl.read(NioSocketImpl.java:350)
    at java.base/sun.nio.ch.NioSocketImpl$1.read(NioSocketImpl.java:803)
    at java.base/java.net.Socket$SocketInputStream.read(Socket.java:981)
    at java.base/sun.security.ssl.SSLSocketInputRecord.read(SSLSocketInputRecord.java:478)
    at java.base/sun.security.ssl.SSLSocketInputRecord.readHeader(SSLSocketInputRecord.java:472)
    at java.base/sun.security.ssl.SSLSocketInputRecord.bytesInCompletePacket(SSLSocketInputRecord.java:70)
    at java.base/sun.security.ssl.SSLSocketImpl.readApplicationRecord(SSLSocketImpl.java:1434)
    at java.base/sun.security.ssl.SSLSocketImpl$AppInputStream.read(SSLSocketImpl.java:1038)
    ... 22 more

As we can see in the console output, the URLs are the same.

site is just a simple object that I create before starting the thread.

import lombok.Data;

@Data
public class Site {
    private final String link;
}

Does anyone know what could be causing this error?

Tags: java, multithreading, sockets, ssl, jsoup

Solution


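"As we can see in the console output, the URLs are the same."

However, they are not the same. They have different hostnames, which is why you get different behavior: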

https://lu.vpbank.com/htm/752/de_LU/Stellenangebote.htm
https://www.vpbank.lu/htm/752/de_LU/Stellenangebote.htm
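A quick way to catch this kind of mismatch is to compare the host parts of the two strings instead of eyeballing them. The following is only a minimal sketch using the standard java.net.URI class; the class name HostCompare, the helper hostOf, and the variable names are made up for illustration and are not part of the original code.

```java
import java.net.URI;

public class HostCompare {
    // Hypothetical helper: returns the host component of a URL string.
    static String hostOf(String url) {
        return URI.create(url).getHost();
    }

    public static void main(String[] args) {
        // The hard-coded string from the question's run() method.
        String hardCoded = "https://lu.vpbank.com/htm/752/de_LU/Stellenangebote.htm";
        // The string printed for this.site.getLink() in the console output.
        String fromDatabase = "https://www.vpbank.lu/htm/752/de_LU/Stellenangebote.htm";

        System.out.println("hard-coded host:    " + hostOf(hardCoded));
        System.out.println("database host:      " + hostOf(fromDatabase));
        System.out.println("same host:          " + hostOf(hardCoded).equals(hostOf(fromDatabase)));
        System.out.println("same full string:   " + hardCoded.equals(fromDatabase));
    }
}
```

Only the path is shared between the two strings; both the host comparison and the full string comparison report a difference.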

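In a browser, the second URL redirects to the first. I'm guessing the second host has different TLS settings, or perhaps validates the connection differently (some required headers?), which is why you get the connection reset error. That, however, is a different problem.

(By the way, thanks for providing enough detail, including printing the actual URLs you are trying to access. It makes it easy to help with fresh eyes!)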

