Rename XLS files using a cell value - remove spaces and special characters

Problem description

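Situation: I am trying to rename the XLS files in a directory using a specific cell value from each file (i.e. cell A4 contains "Name1", so use A4 to create Name1.xls). I found a script that works for my purpose.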

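The problem I am trying to solve: every cell I want to use as a file name contains spaces and special characters. Ideally, I would like to strip all special characters and spaces and use the result as the name of each file. I am not very familiar with regular expressions, so I am not sure whether I should modify the fileNameCheck = re.compile('[^\w,\s-]') part of the code, or modify the if not block first...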

See the code below:

# Import required modules
import openpyxl
import os
import re
import shutil

# File path

filePath = 'C:\\Users\\name\\Documents\\Python\\folder'

# Cell containing new file name
cellForFileName = 'A3'

# Check to see if the file path exists
if os.path.exists(filePath):

    # Change the current working directory
    os.chdir(filePath)

    # Check if there are any files in the chosen directory
    if len(os.listdir(filePath)) == 0:

        print('There are no files to rename')

    else:

        # Renamed file count
        filesRenamed = 0

        # Process the files at the path
        for filename in os.listdir(filePath):

            # Check if the file is an Excel file, excluding temp files
            if filename.endswith('.xls.xlsx') and not filename.startswith('~'):

                try:

                    # Open the file and find the first sheet
                    workbook = openpyxl.load_workbook(filename)
                    worksheet = workbook.worksheets[0]

                    # Check if there is a value in the cell for the new file name
                    if worksheet[cellForFileName].value is not None:

                        # Check to see if the cell value is valid for a file name
                        fileNameCheck = re.compile('[^\w,\s-]')
                        if not fileNameCheck.search(worksheet[cellForFileName].value):

                            # Construct the new file name
                            newFileName = worksheet[cellForFileName].value + '.xlsx'

                            # Close the workbook
                            workbook.close()

                            # Rename the file
                            shutil.move(filename, newFileName)

                            # Output confirmation message
                            print('The file "' + filename + '" has been renamed to "'
                                  + newFileName + '".')

                            # Increment the count
                            filesRenamed += 1

                        else:

                            # Display a message saying the file could not be renamed
                            print('The file "' + filename + '" could not be renamed.')

                            # Close the workbook
                            workbook.close()

                    else:

                        # Display a message saying the file could not be renamed
                        print('The file "' + filename + '" could not be renamed.')

                        # Close the workbook
                        workbook.close()

                except PermissionError as e:

                    # Display a message saying the file could not be renamed
                    print('The file "' + filename + '" could not be renamed.')

        # Display a message regarding the number of files renamed
        if filesRenamed == 1:
            print(str(filesRenamed) + ' file has been renamed.')
        else:
            print(str(filesRenamed) + ' files have been renamed.')

else:

    # Display a message stating that the file path does not exist
    print('File path does not exist.')

Thanks in advance for any help, advice, and tips!

Tags: python, regex, openpyxl

Solution


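I don't think filename.endswith('.xls.xlsx') will work the way you expect: it only matches names that literally end in ".xls.xlsx". According to the str.endswith documentation, you can pass a tuple, .endswith(('.xls', '.xlsx')), to match both .xls and .xlsx. Also, if you are working with both file types, it is better to capture the original extension and reuse it as the suffix during the rename, because the two formats are interpreted differently.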

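... the information stored in the XLS and XLSX formats [...] is very different. XLS is based on BIFF (Binary Interchange File Format), so the information is stored directly in a binary format. XLSX, on the other hand, is based on the Office Open XML format, a file format derived from XML... [1]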

You can use _, extension = os.path.splitext(filename) to get just the extension for later use in the rename operation.
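
As a minimal sketch of the two points above (the helper names is_excel_file and split_extension and the sample file names are made up for illustration), passing a tuple to str.endswith matches either suffix, and os.path.splitext returns the extension to reuse when renaming:

```python
import os

EXCEL_SUFFIXES = ('.xls', '.xlsx')  # tuple: endswith() matches if any suffix fits


def is_excel_file(filename):
    """Return True for .xls/.xlsx names that are not Excel temp files ('~...')."""
    return filename.endswith(EXCEL_SUFFIXES) and not filename.startswith('~')


def split_extension(filename):
    """Return (stem, extension); the extension keeps its leading dot."""
    return os.path.splitext(filename)


# A single string only matches names that literally end in '.xls.xlsx'
print('report.xlsx'.endswith('.xls.xlsx'))   # False
print(is_excel_file('report.xlsx'))          # True
print(is_excel_file('~$report.xlsx'))        # False
print(split_extension('report.xls'))         # ('report', '.xls')
```

Note that str.endswith is case-sensitive, so a name such as REPORT.XLSX would not match unless it is lowercased first.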

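To remove special characters and spaces, you can use re.sub("[^a-zA-Z0-9]", "", nameCell). If the string after the ":" is allowed to contain only special characters and spaces, make sure you test for an empty string before writing the file name.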
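
The substitution and the empty-string check can be sketched on their own as follows. The function names clean_name and safe_stem and the sample strings are hypothetical, chosen only for this illustration:

```python
import re

# Anything that is not an ASCII letter or digit is removed (spaces included)
NON_ALNUM = re.compile(r'[^a-zA-Z0-9]')


def clean_name(raw):
    """Strip every character except ASCII letters and digits."""
    return NON_ALNUM.sub('', raw)


def safe_stem(raw, fallback=None):
    """Return the cleaned name, or `fallback` when nothing is left."""
    cleaned = clean_name(raw)
    return cleaned if cleaned else fallback


print(clean_name(' Name 1 (final)! '))  # Name1final
print(safe_stem(' -- ?? '))             # None
```

Because the character class is limited to ASCII, accented or non-Latin letters are removed as well; widening the class is a design choice that depends on which file names you consider acceptable.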

...
...
    # Process the files at the path
    for filename in os.listdir(filePath):
        # get extension to use later on file rename
        _, extension = os.path.splitext(filename)
        if filename.endswith(('.xls','.xlsx')) and not filename.startswith('~'):
            try:
                workbook = openpyxl.load_workbook(filename)
                worksheet = workbook.worksheets[0]
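                # get the text after the ":" (empty if the cell is blank or has no ":")
                cellValue = worksheet[cellForFileName].value
                match = re.search(":(.+)", str(cellValue)) if cellValue is not None else None
                nameCell = match.group(1) if match else ""
                # or use str.split(":")[1], make sure the range exists
                workbook.close()

                # remove special characters and spaces
                clearName = re.sub("[^a-zA-Z0-9]", "", nameCell)
                # rename only if something usable is left
                if clearName:
                    newFileName = clearName + extension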
                    shutil.move(filename, newFileName)
                    print('The file "' + filename + '" has been renamed to "'
                            + newFileName + '".')
                    filesRenamed += 1
                else:
                    print('The file "' + filename + '" could not be renamed.')

            except PermissionError as e:
            ...
    ...
    ...
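
Putting the pieces together, the renaming decision can be separated from the file I/O so that it is easy to check on its own. This is only a sketch: build_new_name is a hypothetical helper, not part of the original answer, and it assumes the cell text looks like "Label: value" as in the snippet above. It returns None whenever the file should be left alone.

```python
import os
import re


def build_new_name(filename, cell_value):
    """Return the new file name, or None if the file should not be renamed."""
    # Only handle Excel files, skipping temp files such as '~$book.xlsx'
    if not filename.endswith(('.xls', '.xlsx')) or filename.startswith('~'):
        return None
    # Empty cells come back from openpyxl as None
    if cell_value is None:
        return None
    # Take the text after the first ":"; no ":" means nothing to use
    match = re.search(r':(.+)', str(cell_value))
    if match is None:
        return None
    # Drop spaces and special characters; refuse an empty result
    stem = re.sub(r'[^a-zA-Z0-9]', '', match.group(1))
    if not stem:
        return None
    # Keep the original extension so .xls stays .xls and .xlsx stays .xlsx
    _, extension = os.path.splitext(filename)
    return stem + extension


print(build_new_name('old.xlsx', 'Name: Acme & Co. 2021'))  # AcmeCo2021.xlsx
print(build_new_name('old.xls', 'Name: ***'))               # None
```

Inside the loop, you would call it with worksheet[cellForFileName].value and run shutil.move only when the result is not None. Keep in mind that openpyxl.load_workbook does not read the legacy binary .xls format, so those files would have to be opened with another library (xlrd, for example) or converted to .xlsx first.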
