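python - Rename XLS files using a cell value - remove spaces and special characters
Problem description
Situation:
I am trying to rename XLS files in a directory using a specific cell value from each file (i.e. cell A4 contains "Name1", use A4 to create Name1.xls). I found a script that works for my purposes.
The problem I am trying to solve:
Every cell I want to use as a file name contains spaces and special characters. Ideally, I would like to strip all special characters and spaces and use the result to name each file. I am not very familiar with regular expressions, so I am not sure whether I should modify the fileNameCheck = re.compile('[^\w,\s-]') part of the code, or modify the if not block first...
See the code below: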
# Import required modules
import openpyxl
import os
import re
import shutil
# File path
filePath = 'C:\\Users\\name\\Documents\\Python\\folder'
# Cell containing new file name
cellForFileName = 'A3'
# Check to see if the file path exists
if os.path.exists(filePath):
    # Change the current working directory
    os.chdir(filePath)
    # Check if there are any files in the chosen directory
    if len(os.listdir(filePath)) == 0:
        print('There are no files to rename')
    else:
        # Renamed file count
        filesRenamed = 0
        # Process the files at the path
        for filename in os.listdir(filePath):
            # Check if the file is an Excel file, excluding temp files
            if filename.endswith('.xls.xlsx') and not filename.startswith('~'):
                try:
                    # Open the file and find the first sheet
                    workbook = openpyxl.load_workbook(filename)
                    worksheet = workbook.worksheets[0]
                    # Check if there is a value in the cell for the new file name
                    if worksheet[cellForFileName].value is not None:
                        # Check to see if the cell value is valid for a file name
                        fileNameCheck = re.compile('[^\w,\s-]')
                        if not fileNameCheck.search(worksheet[cellForFileName].value):
                            # Construct the new file name
                            newFileName = worksheet[cellForFileName].value + '.xlsx'
                            # Close the workbook
                            workbook.close()
                            # Rename the file
                            shutil.move(filename, newFileName)
                            # Output confirmation message
                            print('The file "' + filename + '" has been renamed to "'
                                  + newFileName + '".')
                            # Increment the count
                            filesRenamed += 1
                        else:
                            # Display a message saying the file could not be renamed
                            print('The file "' + filename + '" could not be renamed.')
                            # Close the workbook
                            workbook.close()
                    else:
                        # Display a message saying the file could not be renamed
                        print('The file "' + filename + '" could not be renamed.')
                        # Close the workbook
                        workbook.close()
                except PermissionError as e:
                    # Display a message saying the file could not be renamed
                    print('The file "' + filename + '" could not be renamed.')
        # Display a message regarding the number of files renamed
        if filesRenamed == 1:
            print(str(filesRenamed) + ' file has been renamed.')
        else:
            print(str(filesRenamed) + ' files have been renamed.')
else:
    # Display a message stating that the file path does not exist
    print('File path does not exist.')
Thanks in advance for any help, advice, and tips!
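Solution
I don't think filename.endswith('.xls.xlsx') will work the way you expect: it only matches names that literally end in ".xls.xlsx". According to the str.endswith documentation, you can pass a tuple of suffixes, .endswith(('.xls', '.xlsx')), to match both .xls and .xlsx.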
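As a minimal sketch of the difference, the snippet below wraps the tuple form in a small helper. The name is_excel_file is hypothetical and not part of the original script; the check itself mirrors the condition used in the question.

```python
def is_excel_file(filename):
    """Hypothetical helper: True for .xls/.xlsx names that are not temp files."""
    # str.endswith accepts a tuple and returns True if any suffix matches
    return filename.endswith(('.xls', '.xlsx')) and not filename.startswith('~')


# The single-string form only matches names ending in the literal ".xls.xlsx"
print('report.xlsx'.endswith('.xls.xlsx'))
print(is_excel_file('report.xlsx'))
print(is_excel_file('~$report.xlsx'))
```

The comparison is case-sensitive, so a name such as REPORT.XLSX would need filename.lower() first if that matters in your directory.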
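Also, if you are working with both file types, it is best to record the original extension and reuse the same suffix when renaming, because the two formats are interpreted differently. Note that openpyxl itself only reads the XML-based formats (such as .xlsx), so a legacy .xls workbook has to be opened with a different reader such as xlrd, or converted first.
"... the information stored by the XLS and XLSX formats [...] is very different. XLS is based on BIFF (Binary Interchange File Format), so the information is stored directly in a binary format. XLSX, on the other hand, is based on the Office Open XML format, a file format derived from XML..." [1]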
You can use _, extension = os.path.splitext(filename) to get just the extension and reuse it later in the rename operation.
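A short sketch of that idea, assuming the new stem has already been read from the cell. The function name with_same_extension is made up for illustration.

```python
import os


def with_same_extension(original_filename, new_stem):
    """Hypothetical helper: build a new file name that keeps the old extension."""
    # splitext splits on the last dot: 'old.report.xlsx' -> ('old.report', '.xlsx')
    _, extension = os.path.splitext(original_filename)
    return new_stem + extension


print(with_same_extension('old.xls', 'Name1'))
print(with_same_extension('old.report.xlsx', 'Name1'))
```

Keeping the original suffix avoids turning a binary .xls file into something named .xlsx, which Excel would then refuse to open or warn about.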
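To remove special characters and spaces, you can use re.sub("[^a-zA-Z0-9]", "", nameCell). If the string after the ":" is allowed to contain only special characters and spaces, make sure to test for an empty string before writing the file name.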
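The cleaning step and the empty-string check can be sketched together as follows. clean_name is a hypothetical helper, and returning None for an empty result is just one possible convention.

```python
import re


def clean_name(raw):
    """Hypothetical helper: keep only ASCII letters and digits.

    Returns None when nothing usable is left, so the caller can skip the rename.
    """
    cleaned = re.sub("[^a-zA-Z0-9]", "", raw)
    return cleaned or None


print(clean_name("Name 1 (draft)!"))
print(clean_name(" #@! "))
```

Because the pattern keeps only ASCII letters and digits, accented or non-Latin characters are dropped as well; widen the character class if those should survive.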
...
...
        # Process the files at the path
        for filename in os.listdir(filePath):
            # Get the extension to reuse later when renaming the file
            _, extension = os.path.splitext(filename)
            if filename.endswith(('.xls', '.xlsx')) and not filename.startswith('~'):
                try:
                    workbook = openpyxl.load_workbook(filename)
                    worksheet = workbook.worksheets[0]
                    cellValue = worksheet[cellForFileName].value
                    workbook.close()
                    # Get the text after the ":" (None if the cell is empty or has no ":")
                    # or use str.split(":")[1], after making sure that index exists
                    match = re.search(":(.+)", str(cellValue or ""))
                    nameCell = match.group(1) if match else None
                    # Remove special characters and spaces
                    clearName = re.sub("[^a-zA-Z0-9]", "", nameCell or "")
                    # Only rename when something is left after cleaning
                    if clearName:
                        newFileName = clearName + extension
                        shutil.move(filename, newFileName)
                        print('The file "' + filename + '" has been renamed to "'
                              + newFileName + '".')
                        filesRenamed += 1
                    else:
                        print('The file "' + filename + '" could not be renamed.')
                except PermissionError as e:
                    ...
...
...
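The step that pulls the text after the ":" is the easiest one to get wrong, because re.search returns None when there is no match and calling .group(1) on that raises AttributeError. The sketch below isolates that step in a hypothetical helper, text_after_colon, assuming the cell may be empty, numeric, or missing the colon.

```python
import re


def text_after_colon(cell_value):
    """Hypothetical helper: return the text after the first ':' or None."""
    if cell_value is None:
        return None
    # str() guards against numeric cells; .+ needs at least one character
    match = re.search(":(.+)", str(cell_value))
    return match.group(1) if match else None


print(repr(text_after_colon("Project: Name 1")))
print(repr(text_after_colon("no colon here")))
print(repr(text_after_colon(None)))
```

The returned text keeps any leading space and any further colons, which is fine here because the later cleaning step removes them.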