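java - ForkJoin runs on 1 thread
Problem description
I have the following code, a simulation of a website scraper: it crawls pages and sub-pages and concatenates the results into a string holding the page contents.
I sized the pool with Runtime.getRuntime().availableProcessors(), so I assumed it would run on multiple threads. That does not seem to be the case.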
package Concurrency;

import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

public class ForkJoinPoolDemo {
    public static class MyTask extends RecursiveTask<String> {
        private String url;

        public MyTask(String url) {
            this.url = url;
        }

        @Override
        protected String compute() {
            System.out.println(Thread.currentThread().getName() + "/" + url);
            if (url.equals("http://google.com/b1")) {
                return "Content from /b1";
            } else if (url.equals("http://google.com/b2")) {
                return "Content from /b2";
            } else if (url.equals("http://google.com/b")) {
                List<MyTask> tasks = new ArrayList<>();
                tasks.add(new MyTask("http://google.com/b1"));
                tasks.add(new MyTask("http://google.com/b2"));
                String result = "Content from /b\n";
                for (MyTask task : tasks) {
                    task.fork();
                    result += task.join() + "\n";
                }
                return result;
            } else if (url.equals("http://google.com")) {
                List<MyTask> tasks = new ArrayList<>();
                tasks.add(new MyTask("http://google.com/a"));
                tasks.add(new MyTask("http://google.com/b"));
                tasks.add(new MyTask("http://google.com/c"));
                tasks.add(new MyTask("http://google.com/d"));
                tasks.add(new MyTask("http://google.com/e"));
                tasks.add(new MyTask("http://google.com/f"));
                tasks.add(new MyTask("http://google.com/g"));
                tasks.add(new MyTask("http://google.com/h"));
                tasks.add(new MyTask("http://google.com/i"));
                tasks.add(new MyTask("http://google.com/j"));
                String result = "Content from /\n";
                for (MyTask task : tasks) {
                    task.fork();
                    result += task.join() + "\n";
                }
                return result;
            } else {
                try {
                    Thread.sleep(1000);
                } catch (InterruptedException e) {
                    e.printStackTrace();
                }
                return "Content from " + url;
            }
        }
    }

    public static void main(String[] args) {
        ForkJoinPool pool = new ForkJoinPool(Runtime.getRuntime().availableProcessors());
        String result = pool.invoke(new MyTask("http://google.com"));
        System.out.println(result);
    }
}
Why does every fork run on the same thread?
ForkJoinPool-1-worker-19/http://google.com
ForkJoinPool-1-worker-19/http://google.com/a
ForkJoinPool-1-worker-19/http://google.com/b
ForkJoinPool-1-worker-19/http://google.com/b1
ForkJoinPool-1-worker-19/http://google.com/b2
ForkJoinPool-1-worker-19/http://google.com/c
ForkJoinPool-1-worker-19/http://google.com/d
ForkJoinPool-1-worker-19/http://google.com/e
ForkJoinPool-1-worker-19/http://google.com/f
ForkJoinPool-1-worker-19/http://google.com/g
ForkJoinPool-1-worker-19/http://google.com/h
ForkJoinPool-1-worker-19/http://google.com/i
ForkJoinPool-1-worker-19/http://google.com/j
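Solution
Each time you fork a new task, you immediately block in join(), waiting for it to finish before you submit the next one. Only one subtask is ever outstanding at a time, so the other workers have nothing to pick up. Instead, fork all the tasks first, then collect their results: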
for (MyTask task : tasks) {
    task.fork();
}
for (MyTask task : tasks) {
    result += task.join() + "\n";
}
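
To see the two-loop pattern in a complete program, here is a minimal sketch. It is not the asker's code: the class name ForkThenJoinDemo, the PageTask class, the crawl helper, the example.com URLs and the 200 ms sleep are all assumptions made for illustration. It forks every sub-page task before joining any of them, and prints the worker thread name for each task so you can check whether more than one worker shows up.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

public class ForkThenJoinDemo {

    // Hypothetical stand-in for MyTask: a page with zero or more slow sub-pages.
    static class PageTask extends RecursiveTask<String> {
        private final String url;
        private final int subPages;

        PageTask(String url, int subPages) {
            this.url = url;
            this.subPages = subPages;
        }

        @Override
        protected String compute() {
            System.out.println(Thread.currentThread().getName() + " " + url);
            if (subPages == 0) {
                try {
                    Thread.sleep(200); // simulated download (assumed duration)
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
                return "Content from " + url;
            }
            List<PageTask> tasks = new ArrayList<>();
            for (int i = 0; i < subPages; i++) {
                tasks.add(new PageTask(url + "/p" + i, 0));
            }
            // Phase 1: schedule every subtask so that idle workers can steal them.
            for (PageTask task : tasks) {
                task.fork();
            }
            // Phase 2: only now block for the results, in list order.
            StringBuilder result = new StringBuilder("Content from " + url + "\n");
            for (PageTask task : tasks) {
                result.append(task.join()).append("\n");
            }
            return result.toString();
        }
    }

    // Runs one crawl on a dedicated pool and returns the concatenated content.
    public static String crawl(String rootUrl, int subPages, int parallelism) {
        ForkJoinPool pool = new ForkJoinPool(parallelism);
        try {
            return pool.invoke(new PageTask(rootUrl, subPages));
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.print(crawl("http://example.com", 4, 4));
    }
}
```

The result string keeps the list order because the joins happen in list order, regardless of which worker ran which task. The thread names printed by compute() vary from run to run. The standard static method ForkJoinTask.invokeAll(tasks) can replace the first loop; you would still call join() on each task afterwards to read its result.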