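java - Trouble compiling JSoup as a Maven dependency - Java
Problem description
I am a new Java developer.
I am having trouble with JSoup. A few days ago I wrote a web crawler (purely to practice and learn new things), and it ran fine in the console. It still had code bugs I needed to fix, so it did not produce the expected results, but at least it ran. Now I don't understand what has happened: it no longer compiles.
This is the error in the output: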
Scanning for projects...
--------------< com.webcrawler.jsoupexample:jsoupexample >--------------
Building jsoupexample 1.0-SNAPSHOT
--------------------------------[ jar ]---------------------------------
--- maven-resources-plugin:2.6:resources (default-resources) @ jsoupexample ---
Using platform encoding (Cp1252 actually) to copy filtered resources, i.e. build is platform dependent!
Copying 0 resource
--- maven-compiler-plugin:3.1:compile (default-compile) @ jsoupexample ---
Changes detected - recompiling the module!
File encoding has not been set, using platform encoding Cp1252, i.e. build is platform dependent!
Compiling 2 source files to C:\Users\**\**\**\**\**\web-crawler-jsoup-example-master\webcrawler\target\classes
-------------------------------------------------------------
COMPILATION ERROR :
-------------------------------------------------------------
com/webcrawler/jsoupexample/ParserEngine.java:[3,17] package org.jsoup does not exist
com/webcrawler/jsoupexample/ParserEngine.java:[4,23] package org.jsoup.nodes does not exist
com/webcrawler/jsoupexample/ParserEngine.java:[5,23] package org.jsoup.nodes does not exist
com/webcrawler/jsoupexample/ParserEngine.java:[6,24] package org.jsoup.select does not exist
com/webcrawler/jsoupexample/ParserEngine.java:[22,9] cannot find symbol
symbol: class Document
location: class com.webcrawler.jsoupexample.ParserEngine
com/webcrawler/jsoupexample/ParserEngine.java:[22,24] cannot find symbol
symbol: variable Jsoup
location: class com.webcrawler.jsoupexample.ParserEngine
com/webcrawler/jsoupexample/ParserEngine.java:[23,9] cannot find symbol
symbol: class Elements
location: class com.webcrawler.jsoupexample.ParserEngine
com/webcrawler/jsoupexample/ParserEngine.java:[25,14] cannot find symbol
symbol: class Element
location: class com.webcrawler.jsoupexample.ParserEngine
8 errors
-------------------------------------------------------------
------------------------------------------------------------------------
BUILD FAILURE
------------------------------------------------------------------------
Total time: 3.465 s
Finished at: 2020-12-06T20:00:55-03:00
------------------------------------------------------------------------
Failed to execute goal org.apache.maven.plugins:maven-compiler-plugin:3.1:compile (default-compile) on project jsoupexample: Compilation failure: Compilation failure:
com/webcrawler/jsoupexample/ParserEngine.java:[3,17] package org.jsoup does not exist
com/webcrawler/jsoupexample/ParserEngine.java:[4,23] package org.jsoup.nodes does not exist
com/webcrawler/jsoupexample/ParserEngine.java:[5,23] package org.jsoup.nodes does not exist
com/webcrawler/jsoupexample/ParserEngine.java:[6,24] package org.jsoup.select does not exist
com/webcrawler/jsoupexample/ParserEngine.java:[22,9] cannot find symbol
symbol: class Document
location: class com.webcrawler.jsoupexample.ParserEngine
com/webcrawler/jsoupexample/ParserEngine.java:[22,24] cannot find symbol
symbol: variable Jsoup
location: class com.webcrawler.jsoupexample.ParserEngine
com/webcrawler/jsoupexample/ParserEngine.java:[23,9] cannot find symbol
symbol: class Elements
location: class com.webcrawler.jsoupexample.ParserEngine
com/webcrawler/jsoupexample/ParserEngine.java:[25,14] cannot find symbol
symbol: class Element
location: class com.webcrawler.jsoupexample.ParserEngine
-> [Help 1]
To see the full stack trace of the errors, re-run Maven with the -e switch.
Re-run Maven using the -X switch to enable full debug logging.
For more information about the errors and possible solutions, please read the following articles:
[Help 1] http://cwiki.apache.org/confluence/display/MAVEN/MojoFailureException
This is my pom.xml:
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>
    <groupId>com.webcrawler.jsoupexample</groupId>
    <artifactId>jsoupexample</artifactId>
    <version>1.0-SNAPSHOT</version>
    <dependencies>
        <!-- https://mvnrepository.com/artifact/org.jsoup/jsoup -->
        <dependency>
            <groupId>org.jsoup</groupId>
            <artifactId>jsoup</artifactId>
            <version>1.11.3</version>
        </dependency>
    </dependencies>
</project>
This is ParserEngine.java.
It is the class whose imports now report errors (I did not have these errors before).
package com.webcrawler.jsoupexample;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements; // these 4 imports have errors now; days ago they did not

import java.io.IOException;
import java.util.ArrayList;

public class ParserEngine {
    private String baseUrl;
    private ArrayList<String> urlList;

    public ParserEngine(String baseUrl){
        this.baseUrl = baseUrl;
        this.urlList = new ArrayList<String>();
    }

    public void crawl(String url) throws IOException {
        Document doc = Jsoup.connect(url).ignoreContentType(true).get();
        Elements links = doc.select("a[href]");
        // Here I found the reason the crawler doesn't work as I want,
        // but that isn't my actual issue; I want to be able to run it in the console again.
        for (Element link : links) {
            String actualUrl = link.attr("abs:href");
            // Follow only links not seen yet that stay under the base URL.
            if (!urlList.contains(actualUrl) && actualUrl.startsWith(baseUrl)){
                print(" * a: <%s> (%s)", actualUrl, trim(link.text(), 35));
                urlList.add(actualUrl);
                crawl(actualUrl);
            }
        }
    }

    private static void print(String msg, Object... args) {
        System.out.println(String.format(msg, args));
    }

    private static String trim(String s, int width) {
        if (s.length() > width)
            return s.substring(0, width-1) + ".";
        else
            return s;
    }

    public String getBaseUrl(){
        return baseUrl;
    }

    public void setBaseUrl(String url){
        baseUrl = url;
    }

    public ArrayList<String> getUrlList(){
        return urlList;
    }
}
This is Main.java:
package com.webcrawler.jsoupexample;

import java.io.IOException;

public class Main {
    public static void main(String[] args) throws IOException {
        String url = "http://elfreneticoinformatico.com";
        ParserEngine parser = new ParserEngine(url);
        parser.crawl(parser.getBaseUrl());
        System.out.println("Crawler finished. Total URLs: " + parser.getUrlList().size());
    }
}
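Can anyone help?
Solution
If you are not using a decent Java IDE (integrated development environment), I suggest NetBeans, IntelliJ IDEA, or Eclipse. NetBeans may be easier to start with because it works with Maven projects directly. An IDE can even detect problems before you run a build.
...Update: I have tried building it with the Maven clean and compile goals, and running it in IntelliJ IDEA Community 2020.3 and in NetBeans 12.2, and... it builds and runs fine, so I am not sure what is going on. The NetBeans build output is: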
cd P:\Users\infernoz\Documents\Java_Projects\infernoz\jsoup-question; "JAVA_HOME=C:\\Program Files\\AdoptOpenJDK\\jdk-8.0.275.1-hotspot" cmd /c "\"C:\\Program Files\\NetBeans-12.2\\netbeans\\java\\maven\\bin\\mvn.cmd\" -Dmaven.ext.class.path=\"C:\\Program Files\\NetBeans-12.2\\netbeans\\java\\maven-nblib\\netbeans-eventspy.jar\" --errors --errors clean compile"
[INFO] Error stacktraces are turned on.
[INFO] Scanning for projects...
[INFO]
[INFO] --------------< com.webcrawler.jsoupexample:jsoupexample >--------------
[INFO] Building jsoupexample 1.0-SNAPSHOT
[INFO] --------------------------------[ jar ]---------------------------------
[INFO]
[INFO] --- maven-clean-plugin:2.5:clean (default-clean) @ jsoupexample ---
[INFO] Deleting P:\Users\infernoz\Documents\Java_Projects\rwperrott\jsoup-question\target
[INFO]
[INFO] --- maven-resources-plugin:2.6:resources (default-resources) @ jsoupexample ---
[WARNING] Using platform encoding (Cp1252 actually) to copy filtered resources, i.e. build is platform dependent!
[INFO] Copying 0 resource
[INFO]
[INFO] --- maven-compiler-plugin:3.1:compile (default-compile) @ jsoupexample ---
[INFO] Changes detected - recompiling the module!
[WARNING] File encoding has not been set, using platform encoding Cp1252, i.e. build is platform dependent!
[INFO] Compiling 2 source files to P:\Users\infernoz\Documents\Java_Projects\rwperrott\jsoup-question\target\classes
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 4.496 s
[INFO] Finished at: 2020-12-22T02:06:04Z
[INFO] ------------------------------------------------------------------------
The IntelliJ run output is:
"C:\Program Files\AdoptOpenJDK\jdk-8.0.275.1-hotspot\bin\java.exe" -javaagent:C:\Users\infernoz\AppData\Local\JetBrains\Toolbox\apps\IDEA-C\ch-1\203.5981.155\lib\idea_rt.jar=65151:C:\Users\infernoz\AppData\Local\JetBrains\Toolbox\apps\IDEA-C\ch-1\203.5981.155\bin -Dfile.encoding=UTF-8 -classpath "C:\Program Files\AdoptOpenJDK\jdk-8.0.275.1-hotspot\jre\lib\charsets.jar;C:\Program Files\AdoptOpenJDK\jdk-8.0.275.1-hotspot\jre\lib\ext\access-bridge-64.jar;C:\Program Files\AdoptOpenJDK\jdk-8.0.275.1-hotspot\jre\lib\ext\cldrdata.jar;C:\Program Files\AdoptOpenJDK\jdk-8.0.275.1-hotspot\jre\lib\ext\dnsns.jar;C:\Program Files\AdoptOpenJDK\jdk-8.0.275.1-hotspot\jre\lib\ext\jaccess.jar;C:\Program Files\AdoptOpenJDK\jdk-8.0.275.1-hotspot\jre\lib\ext\localedata.jar;C:\Program Files\AdoptOpenJDK\jdk-8.0.275.1-hotspot\jre\lib\ext\nashorn.jar;C:\Program Files\AdoptOpenJDK\jdk-8.0.275.1-hotspot\jre\lib\ext\sunec.jar;C:\Program Files\AdoptOpenJDK\jdk-8.0.275.1-hotspot\jre\lib\ext\sunjce_provider.jar;C:\Program Files\AdoptOpenJDK\jdk-8.0.275.1-hotspot\jre\lib\ext\sunmscapi.jar;C:\Program Files\AdoptOpenJDK\jdk-8.0.275.1-hotspot\jre\lib\ext\sunpkcs11.jar;C:\Program Files\AdoptOpenJDK\jdk-8.0.275.1-hotspot\jre\lib\ext\zipfs.jar;C:\Program Files\AdoptOpenJDK\jdk-8.0.275.1-hotspot\jre\lib\jce.jar;C:\Program Files\AdoptOpenJDK\jdk-8.0.275.1-hotspot\jre\lib\jfr.jar;C:\Program Files\AdoptOpenJDK\jdk-8.0.275.1-hotspot\jre\lib\jsse.jar;C:\Program Files\AdoptOpenJDK\jdk-8.0.275.1-hotspot\jre\lib\management-agent.jar;C:\Program Files\AdoptOpenJDK\jdk-8.0.275.1-hotspot\jre\lib\resources.jar;C:\Program Files\AdoptOpenJDK\jdk-8.0.275.1-hotspot\jre\lib\rt.jar;P:\Users\infernoz\Documents\Java_Projects\rwperrott\jsoup-question\target\classes;M:\Repository\org\jsoup\jsoup\1.13.1\jsoup-1.13.1.jar" com.webcrawler.jsoupexample.Main
Crawler finished. Total URLs: 0
Process finished with exit code 0
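Since the same sources build cleanly here, it may be worth checking that the jsoup jar was downloaded intact into the local Maven repository on the failing machine. The sketch below is only illustrative: the class name LocalRepoCheck and both helper methods are hypothetical and not part of the original project. It assumes the default repository location under ~/.m2/repository and the standard groupId/artifactId/version directory layout; a custom settings.xml may place the repository elsewhere.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.zip.ZipFile;

public class LocalRepoCheck {

    // Builds the expected jar location inside a Maven local repository:
    // <repo>/<group id split on dots>/<artifactId>/<version>/<artifactId>-<version>.jar
    static Path expectedJar(Path repoRoot, String groupId, String artifactId, String version) {
        Path dir = repoRoot;
        for (String part : groupId.split("\\.")) {
            dir = dir.resolve(part);
        }
        return dir.resolve(artifactId)
                  .resolve(version)
                  .resolve(artifactId + "-" + version + ".jar");
    }

    // Returns true only if the file exists and opens as a non-empty zip archive.
    static boolean isReadableJar(Path jar) {
        if (!Files.isRegularFile(jar)) {
            return false;
        }
        try (ZipFile zip = new ZipFile(jar.toFile())) {
            return zip.size() > 0;
        } catch (IOException e) {
            return false; // truncated or corrupt file
        }
    }

    public static void main(String[] args) {
        // Assumption: default local repository under the user's home directory.
        Path repo = Paths.get(System.getProperty("user.home"), ".m2", "repository");
        Path jar = expectedJar(repo, "org.jsoup", "jsoup", "1.11.3");
        System.out.println(jar + " readable: " + isReadableJar(jar));
    }
}
```

If the jar turns out to be missing or unreadable, deleting that version directory and running the build again should make Maven download it afresh. This is a diagnostic idea rather than a confirmed cause of the error above.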