ForkJoin runs on 1 thread

Question

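I have the following code, which simulates a website scraper: it fetches pages and sub-pages and concatenates the results into one string containing the page contents.

I pass Runtime.getRuntime().availableProcessors() to the pool, so I assumed the tasks would run on multiple threads. That does not seem to be the case.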

package Concurrency;

import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

public class ForkJoinPoolDemo {

    public static class MyTask extends RecursiveTask<String>
    {
        private String url;
        public MyTask(String url)
        {
            this.url = url;
        }

        @Override
        protected String compute() {

            System.out.println(Thread.currentThread().getName() + "/" + url);

            if(url.equals("http://google.com/b1")) {
                return "Content from /b1";
            } else if(url.equals("http://google.com/b2")) {
                return "Content from /b2";
            } else if(url.equals("http://google.com/b")) {
                List<MyTask> tasks = new ArrayList<>();
                tasks.add(new MyTask("http://google.com/b1"));
                tasks.add(new MyTask("http://google.com/b2"));
                String result = "Content from /b\n";

                for(MyTask task : tasks) {
                    task.fork();
                    result += task.join() + "\n";
                }
                return result;
            } else if(url.equals("http://google.com")) {

                List<MyTask> tasks = new ArrayList<>();
                tasks.add(new MyTask("http://google.com/a"));
                tasks.add(new MyTask("http://google.com/b"));
                tasks.add(new MyTask("http://google.com/c"));
                tasks.add(new MyTask("http://google.com/d"));
                tasks.add(new MyTask("http://google.com/e"));
                tasks.add(new MyTask("http://google.com/f"));
                tasks.add(new MyTask("http://google.com/g"));
                tasks.add(new MyTask("http://google.com/h"));
                tasks.add(new MyTask("http://google.com/i"));
                tasks.add(new MyTask("http://google.com/j"));
                String result = "Content from /\n";

                for (MyTask task : tasks) {
                    task.fork();
                    result += task.join() + "\n";
                }
                return result;
            } else {
                try {
                    Thread.sleep(1000);
                } catch (InterruptedException e) {
                    e.printStackTrace();
                }
                return "Content from " + url;
            }
        }
    }

    public static void main(String[] args) {
        ForkJoinPool pool = new ForkJoinPool(Runtime.getRuntime().availableProcessors());
        String result = pool.invoke(new MyTask("http://google.com"));
        System.out.println(result);
    }
}

Why does every fork run on the same thread?

ForkJoinPool-1-worker-19/http://google.com
ForkJoinPool-1-worker-19/http://google.com/a
ForkJoinPool-1-worker-19/http://google.com/b
ForkJoinPool-1-worker-19/http://google.com/b1
ForkJoinPool-1-worker-19/http://google.com/b2
ForkJoinPool-1-worker-19/http://google.com/c
ForkJoinPool-1-worker-19/http://google.com/d
ForkJoinPool-1-worker-19/http://google.com/e
ForkJoinPool-1-worker-19/http://google.com/f
ForkJoinPool-1-worker-19/http://google.com/g
ForkJoinPool-1-worker-19/http://google.com/h
ForkJoinPool-1-worker-19/http://google.com/i
ForkJoinPool-1-worker-19/http://google.com/j

Tags: java, multithreading, fork-join

Answer


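Each time you fork a new task, you immediately block in join(), waiting for it to finish before you submit the next one. At most one subtask is outstanding at any moment, so the work is effectively sequential. Instead, fork all the tasks first, then collect their results: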

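// First fork every task so they can all run concurrently...
for (MyTask task : tasks) {
    task.fork();
}

// ...then join them in order to collect the results.
for (MyTask task : tasks) {
    result += task.join() + "\n";
}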
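
For reference, here is a minimal, self-contained sketch of the same two-phase pattern. It is not the asker's exact program: the class names ForkThenJoinDemo and PageTask, the generated child URLs (/p0, /p1, ...), the 200 ms simulated fetch, and the THREADS_USED set are all assumptions made for illustration. It records which worker threads ran a task, so you can check whether more than one was used on your machine.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

public class ForkThenJoinDemo {

    // Names of the worker threads that actually executed a task (illustrative only).
    static final Set<String> THREADS_USED = ConcurrentHashMap.newKeySet();

    static class PageTask extends RecursiveTask<String> {
        private final String url;
        private final int children; // number of hypothetical sub-pages; 0 means a leaf page

        PageTask(String url, int children) {
            this.url = url;
            this.children = children;
        }

        @Override
        protected String compute() {
            THREADS_USED.add(Thread.currentThread().getName());

            if (children == 0) {
                // Leaf page: simulate a slow fetch.
                try {
                    Thread.sleep(200);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
                return "Content from " + url;
            }

            List<PageTask> tasks = new ArrayList<>();
            for (int i = 0; i < children; i++) {
                tasks.add(new PageTask(url + "/p" + i, 0));
            }

            // Phase 1: fork everything, so idle workers have tasks to steal.
            for (PageTask task : tasks) {
                task.fork();
            }

            // Phase 2: join in list order, so the output order is deterministic.
            StringBuilder result = new StringBuilder("Content from " + url + "\n");
            for (PageTask task : tasks) {
                result.append(task.join()).append("\n");
            }
            return result.toString();
        }
    }

    // Runs one crawl in its own pool and shuts the pool down afterwards.
    static String crawl(String rootUrl, int children, int parallelism) {
        ForkJoinPool pool = new ForkJoinPool(parallelism);
        try {
            return pool.invoke(new PageTask(rootUrl, children));
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        int parallelism = Runtime.getRuntime().availableProcessors();
        System.out.println(crawl("http://google.com", 4, parallelism));
        System.out.println("Worker threads used: " + THREADS_USED);
    }
}
```

How many distinct worker names end up in THREADS_USED depends on the machine and on scheduling, so the sketch makes no promise about the exact count. If you do not need to interleave other work between the two phases, ForkJoinTask.invokeAll(tasks) is a standard alternative that forks and joins a whole collection in one call.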
