首页 > 解决方案 > 这是使用openpyxl删除Excel行的更快方法吗?

问题描述

我有一个 Excel 行号列表,我想使用 Openpyxl 以 2138 长度删除这些行号。这是代码:

delete_this_row = [1,2,....,2138]

for delete in delete_this_row:
    worksheet.delete_rows(delete)

但它太慢了。完成该过程需要 45 秒,直到 1 分钟。

这是完成任务的更快方法吗?

标签: pythonperformanceopenpyxl

解决方案


几乎总是有一种更快的方法来做某事。有时成本太高,但在这种情况下不是,我怀疑:-)

如果它只是您要删除的一组连续行,您可以使用:

worksheet.delete_rows(1, 2138)

此处的文档,为完整起见,复制如下:

delete_rows(idx, amount=1):从 中删除一行或多行row==idx

您的解决方案很慢,因为每次删除单行时,都必须将该点以下的所有内容向上移动一行,然后删除最后一行。

通过传入行数,它改为执行一次移位,将行2139..max直接向上移动到 rows 1..max-2138,然后删除下面的所有行max-2138

这可能比您现在的速度快大约 2,138 倍 :-)


如果您的数组中有任意行号,您仍然可以使用这种方法来尽可能优化它。

这里的想法是首先将您的行列表变成一个元组列表,其中每个元组都有:

  • 起始行;和
  • 要从那里删除的行数。

理想情况下,您还可以按相反的顺序生成它,这样您就可以按原样处理它。以下代码段显示了如何执行此操作,其中打印而不是调用 openpyxl 调用:

def reverseCombiner(rowList):
    # Don't do anything for empty list. Otherwise,
    # make a copy and sort.

    if len(rowList) == 0: return []
    sortedList = rowList[:]
    sortedList.sort()

    # Init, empty tuple, use first item for previous and
    # first in this run.

    tupleList = []
    firstItem = sortedList[0]
    prevItem = sortedList[0]

    # Process all other items in order.

    for item in sortedList[1:]:
        # If start of new run, add tuple and use new first-in-run.

        if item != prevItem + 1:
            tupleList = [(firstItem, prevItem + 1 - firstItem)] + tupleList
            firstItem = item

        # Regardless, current becomes previous for next loop.

        prevItem = item

    # Finish off the final run and return tuple list.

    tupleList = [(firstItem, prevItem + 1 - firstItem)] + tupleList
    return tupleList

# Test data, hit me with anything :-)

myList = [1, 70, 71, 72, 98, 21, 22, 23, 24, 25, 99]

# Create tuple list, show original and that list, then process.

tuples = reverseCombiner(myList)
print(f"Original: {myList}")
print(f"Tuples:   {tuples}\n")
for tuple in tuples:
    print(f"Would execute: worksheet.delete_rows({tuple[0]}, {tuple[1]})")

输出是:

Original: [1, 70, 71, 72, 98, 21, 22, 23, 24, 25, 99]
Tuples:   [(98, 2), (70, 3), (21, 5), (1, 1)]

Would execute: worksheet.delete_rows(98, 2)
Would execute: worksheet.delete_rows(70, 3)
Would execute: worksheet.delete_rows(21, 5)
Would execute: worksheet.delete_rows(1, 1)

推荐阅读