Index of the maximum in an array when there are multiple maxima, chosen uniformly, with the number of maxima known

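Problem description

This is an interview question I came across for which I have not seen a good solution.
The first part of the question is:
Given a vector of integers, find the index of the maximum value. However, if the maximum occurs more than once, each index of the maximum should have the same probability of being chosen.

For example: for the vector (0,1,2,2,2,2), index 2 is chosen with probability 0.25 (and likewise for indices 3, 4 and 5).

You can solve it in C++ like this: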

#include <cstdlib>
#include <limits>
#include <vector>
using namespace std;

// Returns -1 if numbers is empty
int getIndex(const vector<int>& numbers) {
  int maxNum = numeric_limits<int>::min();
  int idx = -1;
  size_t maxNumCtr = 0;

  for (size_t i = 0; i < numbers.size(); ++i) {
    int num = numbers[i];
    if (num > maxNum) {
      // new maximum: restart the count
      idx = static_cast<int>(i);
      maxNum = num;
      maxNumCtr = 1;
    } else if (num == maxNum) {
      // the k-th occurrence replaces the current pick with probability 1/k
      maxNumCtr++;
      idx = ((rand() % maxNumCtr) == 0) ? static_cast<int>(i) : idx;
    }
  }
  return idx;
}
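
A short derivation of why replacing the pick with probability 1/j at the j-th occurrence gives a uniform choice. It assumes the maximum occurs k times and that rand() % j is uniform over 0..j-1, which is only approximately true in practice. The j-th occurrence must be picked when it is seen and then survive every later occurrence m:

```latex
\[
P(\text{$j$-th occurrence is returned})
  = \frac{1}{j} \prod_{m=j+1}^{k} \left(1 - \frac{1}{m}\right)
  = \frac{1}{j} \prod_{m=j+1}^{k} \frac{m-1}{m}
  = \frac{1}{j} \cdot \frac{j}{k}
  = \frac{1}{k}.
\]
```

The product telescopes, so the result does not depend on j. For the vector (0,1,2,2,2,2), k = 4 and every index of the maximum has probability 0.25.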

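The second part is:
Now you get an additional function parameter that gives the number of occurrences of the maximum in the vector. Try to improve the running time of the algorithm you wrote.

My idea is that you can call rand() just once at the start of the function to choose a uniformly distributed index among the maxima, and use a counter variable to know when you have reached the right maximum in the loop. But this does not improve the running time, since rand() is O(1).

Any better ideas?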

Tags: c++, algorithm, performance

Solution


Just because something has the same big-O complexity doesn't mean it has the same runtime. The interviewer is asking for a change in the constant factor, not a complexity improvement.

Here's how I'd do it without the count:

#include <algorithm>
#include <iterator>
#include <random>
#include <vector>

// Undefined if first == last
template<typename ForwardIterator, typename URBG>
int getIndex(ForwardIterator first, ForwardIterator last, URBG&& gen)
{
    int max = *std::max_element(first, last);
    // weight 1 for every element equal to the maximum, 0 otherwise
    std::vector<double> weights;
    std::transform(first, last, std::back_inserter(weights), [max](int i){ return i == max; });
    // or, given a range numbers: auto weights = numbers | ranges::view::transform([max](int i){ return i == max; });

    std::discrete_distribution<int> dis(weights.begin(), weights.end());
    return dis(gen);
}

That makes two passes over the data, and builds a weights vector and a std::discrete_distribution of the same size as the input.
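
To illustrate why 0/1 weights amount to a uniform pick among the maxima, here is a minimal standalone sketch. The helper name sampleCounts, the sample count and the seed are assumptions made for illustration, and the weights are written out by hand for the vector (0,1,2,2,2,2) from the question.

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <vector>

// Hypothetical helper: draws `samples` indices from a discrete_distribution
// built from the given weights and returns how often each index was drawn.
std::vector<int> sampleCounts(const std::vector<double>& weights, int samples, unsigned seed)
{
    assert(!weights.empty());
    std::mt19937 gen(seed);
    std::discrete_distribution<int> dis(weights.begin(), weights.end());
    std::vector<int> counts(weights.size(), 0);
    for (int s = 0; s < samples; ++s)
        ++counts[static_cast<std::size_t>(dis(gen))];
    return counts;
}
```

Calling sampleCounts({0, 0, 1, 1, 1, 1}, 40000, 42) should leave indices 0 and 1 at zero and give each of indices 2 to 5 roughly a quarter of the draws.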

If you instead already had a count of how many to pick from, you could do it in one pass and use a std::uniform_int_distribution:

#include <cstddef>
#include <iterator>
#include <random>

// Undefined if first == last
template<typename ForwardIterator, typename URBG>
int getIndex(ForwardIterator first, ForwardIterator last, std::size_t max_count, URBG&& gen)
{
    // one random draw up front: which occurrence of the maximum to pick
    std::uniform_int_distribution<std::size_t> dis(1, max_count);
    std::size_t nth = dis(gen);

    ForwardIterator best = first;
    std::size_t nth_best = nth;

    for (ForwardIterator it = first; it != last; ++it)
    {
        if (*best < *it)
        {
            // new max: restart the countdown
            best = it;
            nth_best = nth;
        }
        else if (*it < *best) {} // nothing
        else if ((--nth_best) == 0)
        {
            // countdown for this value reached zero: move to this occurrence
            best = it;
        }
    }

    return static_cast<int>(std::distance(first, best));
}
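
The random draw can also be separated from the scan, which makes the single-pass idea easier to check by hand. The sketch below is one possible way to write it, not the code above: nthMaxIndex and getIndexWithCount are hypothetical names, and the code assumes a non-empty vector and that maxCount really is the number of occurrences of the maximum.

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <vector>

// Hypothetical helper: index of the nth occurrence (1-based) of the maximum.
// Assumes numbers is non-empty and 1 <= nth <= number of maxima.
std::size_t nthMaxIndex(const std::vector<int>& numbers, std::size_t nth)
{
    assert(!numbers.empty() && nth >= 1);
    std::size_t best = 0;
    std::size_t toSkip = nth - 1;   // equal elements still to step over
    for (std::size_t i = 1; i < numbers.size(); ++i)
    {
        if (numbers[best] < numbers[i])
        {
            best = i;               // new maximum: restart the countdown
            toSkip = nth - 1;
        }
        else if (numbers[i] == numbers[best] && toSkip > 0)
        {
            best = i;               // step to the next occurrence
            --toSkip;
        }
    }
    return best;
}

// One random draw up front, then a single pass over the data.
template<typename URBG>
std::size_t getIndexWithCount(const std::vector<int>& numbers, std::size_t maxCount, URBG&& gen)
{
    std::uniform_int_distribution<std::size_t> dis(1, maxCount);
    return nthMaxIndex(numbers, dis(gen));
}
```

For the vector (0,1,2,2,2,2), nth values 1 to 4 map to indices 2 to 5, so a uniform draw of nth gives each index of the maximum a probability of 0.25. The scan cannot stop early, because a smaller value may also repeat before the true maximum appears.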
