algorithm - 找到这个序列的前 k 个元素

第一步是对数组进行排序。

然后，我们维护一组候选子集，其大小等于k - i，如果i是已经找到的子集的数量。

下一个选择的子集对应于这组候选中总和最小的子集。

然后，每次添加一个新的解决方案时，S = {i0, i1, i2}例如对应于一个集合，其中iu是索引，那么我们必须将所有的作为附加候选者S U {j}，其中j的索引大于i2，并且这里U表示联合。

在实践中，我们只需要记住最后一个索引的值，所以j在这里。我们还需要保留相应的总和值，以便能够对候选集进行排序。

我们可以为这组候选者使用最小堆。但是，为了限制复杂性，我们需要限制候选者的数量（k-i）。为此，可以使用最小-最大堆。由于 C++ 标准库中没有这样的结构，我std::multiset在下面的实现中使用了 a。

#include <iostream>
#include <vector>
#include <algorithm>
#include <set>
#include <iterator>

struct Subset {
    int sum;
    int index_max;
};

std::vector<int> solution (std::vector<int>& A, int n, int k) {
    auto Comp = [] (const Subset& s0, const Subset& s1) {
        if (s0.sum == s1.sum) {
            return s0.index_max < s1.index_max;
        }
        return s0.sum < s1.sum;
    };
    
    std::sort (A.begin(), A.end());
    
    std::vector<int> sums;
    sums.reserve (k);
    std::multiset<Subset, decltype(Comp)> candidates (Comp);
    for (int i = 0; i < k; ++i) {
           candidates.insert ({A[i], i});
    }

    for (int i = 0; i < k; ++i) {
        //auto n = Cells.size();
        auto top = candidates.begin();
        auto best = *top;
        candidates.erase (top);
        sums.push_back (best.sum);
        
        if (i == k-1) continue;
        auto last = std::prev(candidates.end());
        auto vmax = (*last).sum;
        
        for (int j = best.index_max + 1; j < k; ++j) {
            int new_sum = best.sum + A[j];
            if (new_sum >= vmax) break;
            candidates.insert({new_sum, j});
            candidates.erase (last);
            last = std::prev(candidates.end());
            vmax = (*last).sum;
        }
    }
    return sums;
}

int main() {
    std::vector<int> A = {1, 2, 3, 6, 6, 7};
    int n = A.size();
    int k = n;
    auto sums = solution (A, n, k);
    for (int x: sums) {
        std::cout << x << " ";
    }
    std::cout << std::endl;
    return 0;
}

复杂性：

从理论上评估复杂性并不容易。

第一步，排序是 O(n logn)。

对于第二步，复杂性取决于 multiset 中插入/删除的数量candidates。每个插入成本 O(logk)。我无法从理论上评估插入的数量。

k = n我针对数据值的不同大小和不同范围做了几个基准测试。

我测量到插入的数量（在内部for循环中）相当稳定并且大约等于n log2(n)/2，除了非常低的数据范围之外，该值会减小。

使用这个测量值，对于n = k，我们得到的全局复杂度O(n logn^2)。

对于n = 200000，在我的 PC 上，每次大约需要 1 秒才能得到结果。

algorithm - 找到这个序列的前 k 个元素

问题描述

解决方案

推荐阅读