How do you group a stream into collections of a given size?

Question

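I have a stream of data coming from a database, and I want to use the Java Stream API to iterate over it, pick X items at a time, and emit each group to another stream. Order does not matter.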

Something like this:

source
.collect(x -> collectNItems(10)) // basically from the stream choose 10 items
.flatmap(collectedItems -> Stream.of(collectedItems))
.map(x -> buildElasticSearchBatchInsertRequest(x))
.forEach(request -> insertToElasticSearch(request));

Say the source is 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14.

After the flatMap, I should get a stream of:

(12, 1, 14, 3)
(2, 4, 5 , 8, 7, 6, 10, 11, 9, 13)

(Again, the order does not matter.) I just need the items grouped together.

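The main use case is bulk-inserting database data into Elasticsearch: inserting one record at a time is slow, and bulk-inserting everything in a single request uses a lot of memory.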

Tags: java, java-stream

Solution


I found one buried at https://code-examples.net/en/q/1d38ce7

Usage:

@Test
public void try2() {
  Stream<Integer> stream = Stream.of(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14);
  BatchSpliterator.decorate(stream, 10)
    .forEach(System.out::println);
}

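And the decorator, BatchSpliterator, only slightly tweaked from the original link, because estimateSize should return MAX_VALUE when the size of the base spliterator is unknown: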

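import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Spliterator;
import java.util.function.Consumer;
import java.util.stream.Stream;
import java.util.stream.StreamSupport;

public class BatchSpliterator<E> implements Spliterator<List<E>> {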

    public static <E> Stream<List<E>> decorate(Stream<E> originalStream, int batchSize) {
        return StreamSupport.stream(new BatchSpliterator<>(originalStream.spliterator(), batchSize), originalStream.isParallel());
    }

    private final Spliterator<E> base;

    private final int batchSize;

    private BatchSpliterator(Spliterator<E> base, int batchSize) {
        this.base = base;
        this.batchSize = batchSize;
    }

    @Override
    public boolean tryAdvance(Consumer<? super List<E>> action) {
        final List<E> batch = new ArrayList<>(batchSize);
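        // Pull up to batchSize elements, stopping as soon as the base is exhausted.
        while (batch.size() < batchSize && base.tryAdvance(batch::add)) {
            // batch::add has already stored the element
        }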
        if (batch.isEmpty()) {
            return false;
        }
        action.accept(batch);
        return true;
    }

    @Override
    public Spliterator<List<E>> trySplit() {
        if (base.estimateSize() <= batchSize)
            return null;
        final Spliterator<E> splitBase = this.base.trySplit();
        return splitBase==null ? null
            :new BatchSpliterator<>(splitBase, batchSize);
    }

    @Override
    public long estimateSize() {
        final long baseSize = base.estimateSize();
        return baseSize==Long.MAX_VALUE ? baseSize
            :(long) Math.ceil(baseSize / (double) batchSize);
    }

    @Override
    public int characteristics() {
        return base.characteristics();
    }

    @Override
    public boolean hasCharacteristics(int characteristics) {
        return base.hasCharacteristics(characteristics);
    }

    @Override
    public Comparator<? super List<E>> getComparator() {
        throw new UnsupportedOperationException("getComparator");
    }
}
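
For the original use case (sending each batch as one bulk request), a simpler sequential alternative is to read the stream through its iterator. The sketch below is not from the linked answer: `BatchingSketch` and `forEachBatch` are hypothetical names, the `println` call stands in for the real Elasticsearch bulk insert, and it assumes the stream is consumed sequentially on a single thread.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.function.Consumer;
import java.util.stream.Stream;

public class BatchingSketch {

    // Hypothetical helper: pulls elements from the stream's iterator and hands
    // each full batch, plus the final partial batch, to the given action.
    // Only one batch is held in memory at a time.
    public static <T> void forEachBatch(Stream<T> source, int batchSize, Consumer<List<T>> action) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        Iterator<T> it = source.iterator();
        List<T> batch = new ArrayList<>(batchSize);
        while (it.hasNext()) {
            batch.add(it.next());
            if (batch.size() == batchSize) {
                action.accept(batch);
                // Start a fresh list so the action may keep the previous one.
                batch = new ArrayList<>(batchSize);
            }
        }
        if (!batch.isEmpty()) {
            action.accept(batch);
        }
    }

    public static void main(String[] args) {
        Stream<Integer> source = Stream.of(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14);
        // Placeholder for building and sending one bulk request per batch.
        forEachBatch(source, 10, batch -> System.out.println("bulk insert of " + batch));
    }
}
```

With the 14-element source from the question and a batch size of 10, this yields one batch of ten items followed by one batch of four. Unlike BatchSpliterator, it does not produce a Stream<List<E>> and cannot run in parallel, so it fits only when the batches are consumed immediately.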
