首页 > 解决方案 > How can I do an efficient in-place sort on an arbitrary indexable collection?

问题描述

Question

How can I use Python's inbuilt sort (or similar) to do an in-place sort on a arbitrary collection (which supports __len__, __getitem__ and __setitem__ methods)?

Background

The question arose from considering the following problem (somewhat related to this question). Suppose that I have two large lists of equal length:

hours   = [20, 21, 18, 18, 19]
minutes = [15, 14, 13, 12, 11]

(in this example representing the times 20:15, 21:14, etc).

I want to sort these lists together in-place as pairs of values from the two lists (such as those retured by zip), to give:

hours   = [18, 18, 19, 20, 21]
minutes = [12, 13, 11, 15, 14]

One could implement this by encapsulating both lists into a custom collection that uses paired values when indexing, and then sorting that collection.

Here's the collection class:

class Times:
    
    def __init__(self, hours, minutes):
        if len(hours) != len(minutes):
            raise ValueError('lists must be of same length')
        self.hours = hours
        self.minutes = minutes

    def __getitem__(self, i):
        return (self.hours[i], self.minutes[i])

    def __setitem__(self, i, vals):
        self.hours[i], self.minutes[i] = vals

    def __len__(self):
        return len(self.hours)

We can sort the underlying lists in a memory-efficient way by doing an in-place sort on the top-level collection. But the following example uses a CPU-inefficient sorting algorithm.

def bubble_sort(x):
    n = len(x)
    for last in range(n-1, 0, -1): 
        for pos in range(last):
              if x[pos] > x[pos+1] : 
                  x[pos], x[pos+1] = x[pos+1], x[pos] 

hours   = [20, 21, 18, 18, 19]
minutes = [15, 14, 13, 12, 11]
times = Times(hours, minutes)

bubble_sort(times)

print(hours)     # [18, 18, 19, 20, 21]
print(minutes)   # [12, 13, 11, 15, 14]

Alternatively, we can use Python's more efficient builtin sorted, but then we are allocating more memory because it is not sorting in-place.

hours   = [20, 21, 18, 18, 19]
minutes = [15, 14, 13, 12, 11]
times = Times(hours, minutes)

for i, v in enumerate(sorted(times)):
    times[i] = v

print(hours)    # [18, 18, 19, 20, 21]
print(minutes)  # [12, 13, 11, 15, 14]

So it would be desirable to be able to do an in-place sort using Python's inbuilt sort (or something similar). Here is a failed attempt to use list.sort on something other than a list -- it isn't supported:

>>> list.sort(times)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: descriptor 'sort' requires a 'list' object but received a 'instance'

How can this be done?

标签: pythonsortingin-place

解决方案


推荐阅读