Sorting a map with more than 50k entries takes a long time. Is there a faster way to sort a map in Dart?

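Problem description

I have probability values returned from a neural network. The returned list has 50,257 entries, so there are a lot of values. The list looks like [-126.32508850097656, -126.77257537841797, -127.69950866699219, -129.98387145996094, ......]

I need the top K values together with their indices, so I convert the list to a map: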

final temp = outputLogits.asMap();

Then I sort the keys by their values, in descending order:

// `!` is needed under null safety, because Map lookups return a nullable value.
var sortedKeys = temp.keys.toList(growable: false)
  ..sort((k1, k2) => temp[k2]!.compareTo(temp[k1]!));

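This produces the expected result, but it takes far too long.

Am I doing something wrong? Is there a more efficient way to get the same result?

More details:

The unsorted list looks like this: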

[-126.32508850097656, -126.77257537841797, -127.69950866699219, -129.98387145996094, -128.03782653808594, -128.08395385742188, -126.33218383789062, -126.6927261352539, -127.6688232421875, -126.58303833007812, -127.32843017578125, -126.1390380859375, -126.54962158203125, -126.38087463378906, -127.82595825195312, -126.3281021118164, -125.81211853027344, -126.20887756347656, -125.95697784423828, -126.07755279541016, -126.35894012451172, -126.70021057128906, -127.03215026855469, -126.67304992675781, -126.92938995361328, -126.64434814453125, -128.20814514160156, -127.24195861816406, -128.25816345214844, -126.73397827148438, -127.62574768066406, -128.8334197998047, -124.46258544921875, -126.03125762939453, -126.18477630615234, -125.85749053955078, -126.11980438232422, -125.64325714111328, -126.06704711914062, -126.35154724121094, -124.83910369873047, -126.90412902832031, -126.02999877929688, -126.60641479492188, -125.97348022460938, -126.56074523925781, -126.58230590820312, -126.49268341064453, -128.5759735107422,

I need to find the 40 highest probabilities along with their indices, which I do as follows:

// Converts the list above to a Map<int, double> (index -> value).
final temp = outputLogits.asMap();

// Sort the keys by value, descending, then take the largest 40.
// `!` is needed under null safety, because Map lookups return a nullable value.
var sortedKeys = temp.keys.toList(growable: false)
  ..sort((k1, k2) => temp[k2]!.compareTo(temp[k1]!));
final Map<int, double> sortedMap = {};

for (final key in sortedKeys.take(40)) {
  sortedMap[key] = temp[key]!;
}

After sorting, sortedMap looks like this:

{198: -117.52079772949219, 383: -118.29053497314453, 887: -119.25838470458984, 1119: -119.66973876953125, 632: -119.74752807617188, 628: -119.87970733642578, 554: -119.88958740234375, 1081: -119.9058837890625, 843: -120.10496520996094, 317: -120.21776580810547, 2102: -120.23406982421875, 770: -120.31946563720703, 2293: -120.40717315673828, 1649: -120.44376373291016, 366: -120.47624969482422, 2080: -120.4794921875, 2735: -120.74302673339844, 3244: -120.89102935791016, 2893: -120.97686004638672, 314: -120.98660278320312, 5334: -121.00469970703125, 1318: -121.03706359863281, 679: -121.12769317626953, 1881: -121.14120483398438, 1629: -121.18737030029297, 50256: -121.19244384765625, 357: -121.22344207763672, 1550: -121.27531433105469, 775: -121.31112670898438, 7486: -121.3316421508789, 921: -121.37474060058594, 1114: -121.43411254882812, 2312: -121.43602752685547, 1675: -121.51364135742188, 4874: -121.5697021484375, 1867: -121.57322692871094, 1439: -121.60330963134766, 8989: -121.60348510742188, 1320: -121.604621

I need the highest values and their respective indices, which is why I convert the list to a Map.

Tags: sorting, dart

Solution


Try the following:

void main() {
  final temp = [
    -126.32508850097656,
    -126.77257537841797,
    -127.69950866699219,
    -129.98387145996094,
    -128.03782653808594,
    -128.08395385742188,
    -126.33218383789062,
    -126.6927261352539,
    -127.6688232421875,
    -126.58303833007812,
    -127.32843017578125,
  ];

  final filteredLogitsWithIndexes = Map.fromEntries(
      (temp.asMap().entries.toList(growable: false)
            ..sort((e1, e2) => e2.value.compareTo(e1.value)))
          .take(5));

  print(filteredLogitsWithIndexes);
  // {0: -126.32508850097656, 6: -126.33218383789062, 9: -126.58303833007812,
  // 7: -126.6927261352539, 1: -126.77257537841797}
}

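This should save you a lot of time, because the comparator no longer has to do a map lookup for every comparison (a MapEntry already contains both the key and the value).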
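
If only the top K entries are needed, a full sort may not be necessary at all. The following is a minimal sketch of a different technique from the answer above: a single pass over the list that keeps a small buffer of the K best indices seen so far, sorted by descending value. The function name topKIndices is hypothetical and chosen only for illustration. It assumes K is small compared with the list length (for example 40 out of 50,257), in which case most elements are rejected by a single comparison against the current smallest entry in the buffer.

```dart
/// Returns the indices of the [k] largest values in [values],
/// ordered from the largest value to the smallest.
/// (Hypothetical helper, not part of the original answer.)
List<int> topKIndices(List<double> values, int k) {
  if (k <= 0) return <int>[];

  // `best` holds at most k indices, kept sorted by descending value.
  final best = <int>[];
  for (var i = 0; i < values.length; i++) {
    // Fast path: the buffer is full and this value cannot enter it.
    if (best.length == k && values[i] <= values[best.last]) continue;

    // Find the insertion point that keeps `best` in descending order.
    var pos = best.length;
    while (pos > 0 && values[best[pos - 1]] < values[i]) {
      pos--;
    }
    best.insert(pos, i);

    // Drop the smallest entry if the buffer grew past k.
    if (best.length > k) best.removeLast();
  }
  return best;
}

void main() {
  final logits = <double>[
    -126.32508850097656,
    -126.77257537841797,
    -127.69950866699219,
    -129.98387145996094,
    -128.03782653808594,
    -128.08395385742188,
    -126.33218383789062,
    -126.6927261352539,
    -127.6688232421875,
    -126.58303833007812,
    -127.32843017578125,
  ];

  // Build an index -> value map; map literals keep insertion order.
  final top = {for (final i in topKIndices(logits, 5)) i: logits[i]};
  print(top);
  // {0: -126.32508850097656, 6: -126.33218383789062, 9: -126.58303833007812,
  // 7: -126.6927261352539, 1: -126.77257537841797}
}
```

For the question's case this would be called as topKIndices(outputLogits, 40). Whether it actually beats sorting the entries depends on the data and on K, so it is worth timing both versions (for example with Stopwatch) on the real list before committing to one.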

