python - Custom sort order for files read from a directory
Problem description
I have a directory with the following structure:
Main directory:
|--2001
    |--200101
        |--feed_013_01.zip
        |--feed_restr_013_01.zip
        |--feed_013_04.zip
        |--feed_restr_013_04.zip
        ...
        |--feed_013_30.zip
        |--feed_restr_013_30.zip
...
|--2021
    |--202101
        |--feed_013_01.zip
        |--feed_restr_013_01.zip
        |--feed_013_04.zip
        |--feed_restr_013_04.zip
        ...
        |--feed_013_30.zip
        |--feed_restr_013_30.zip
I need to read the zip files sorted in this order:
feed_restr_013_30.zip, feed_013_30.zip.....feed_restr_013_01.zip, feed_013_01.zip
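This is what I am currently doing:
import os
import re

def atoi(text):
    # Convert digit runs to int so they compare numerically
    return int(text) if text.isdigit() else text

def natural_keys(text):
    # Split the name into alternating text and number chunks
    return [atoi(c) for c in re.split(r'(\d+)', text)]

for path, subdirs, files in os.walk(directory):
    subdirs.sort(key=natural_keys)
    subdirs.reverse()
    files.sort(key=natural_keys)
    files.reverse()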
This puts all the "restr" files first, so the list I get looks like this:
feed_restr_013_30.zip,feed_restr_013_01.zip.....feed_013_30.zip, feed_013_01.zip
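The grouping can be reproduced on a handful of names. The sketch below reuses the question's `atoi` and `natural_keys`; the `names` list is an illustrative sample, not the real directory contents. Because the first chunk of the key is the text before the first number, `'feed_restr_'` and `'feed_'` are compared before any of the numbers are.

```python
import re

def atoi(text):
    return int(text) if text.isdigit() else text

def natural_keys(text):
    return [atoi(c) for c in re.split(r'(\d+)', text)]

# Illustrative sample of file names from one month directory
names = [
    "feed_013_01.zip",
    "feed_restr_013_01.zip",
    "feed_013_30.zip",
    "feed_restr_013_30.zip",
]

# The leading text chunk differs, so it decides the order on its own
key_restr = natural_keys("feed_restr_013_30.zip")  # ['feed_restr_', 13, '_', 30, '.zip']
key_plain = natural_keys("feed_013_30.zip")        # ['feed_', 13, '_', 30, '.zip']

ordered = sorted(names, key=natural_keys, reverse=True)
print(ordered)
```

To interleave the two kinds of file, the day number has to come before the "restr" marker in the sort key.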
Update
I was able to solve this by combining buran's and SCKU's answers with my existing logic:
import os
import re

def atoi(text):
    return int(text) if text.isdigit() else text

def parse(fname):
    try:
        prefix, *middle, n1, n2 = fname.split('_')
    except ValueError:
        # Fewer than three '_'-separated parts: there is no second number
        prefix, *middle, n1 = fname.split('_')
        n2 = ''
    return (prefix, n1, [atoi(c) for c in re.split(r'(\d+)', n2)], ''.join(middle))

def get_Files(directory, source, keywords):
    # source and keywords are not used in this excerpt
    file_paths = []
    for path, subdirs, files in os.walk(directory):
        for file in files:
            file_name = os.path.join(path, file)
            file_paths.append(file_name)
    return file_paths

files = get_Files(directory, source, keywords)
files.sort(key=parse, reverse=True)
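As a rough check of what this key produces, the sketch below applies `parse` to bare file names; the `names` list is an illustrative sample. In the code above the key is applied to full paths, so there the first element of the tuple also contains the directory part of the path.

```python
import re

def atoi(text):
    return int(text) if text.isdigit() else text

def parse(fname):
    try:
        prefix, *middle, n1, n2 = fname.split('_')
    except ValueError:
        prefix, *middle, n1 = fname.split('_')
        n2 = ''
    return (prefix, n1, [atoi(c) for c in re.split(r'(\d+)', n2)], ''.join(middle))

# Illustrative sample of file names from one month directory
names = [
    "feed_013_01.zip",
    "feed_restr_013_01.zip",
    "feed_013_30.zip",
    "feed_restr_013_30.zip",
]

# The day number (third element) now comes before the 'restr' marker (fourth)
key_restr = parse("feed_restr_013_30.zip")  # ('feed', '013', ['', 30, '.zip'], 'restr')
key_plain = parse("feed_013_30.zip")        # ('feed', '013', ['', 30, '.zip'], '')

ordered = sorted(names, key=parse, reverse=True)
print(ordered)
```

With `reverse=True` the higher day sorts first, and for the same day `'restr'` sorts ahead of the empty string, which gives the interleaved order asked for in the question.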
Solution
If your directory structure is regular and not too large, I suggest collecting all the file paths and sorting them in one go:
import os

# Collect every file with its full path
all_files_path = []
for path, subdirs, files in os.walk(directory):
    for f in files:
        all_files_path.append(os.path.join(path, f))

# Define a custom sort key function
def which_items_you_want_to_compare(fpath):
    # From buran's answer: sort key for the parts of the file name
    def parse(fname):
        prefix, *middle, n1, n2 = fname.split('_')
        return (prefix, n1, n2, ''.join(middle))

    fpath_split = fpath.split(os.path.sep)
    fn = fpath_split[-1]  # file name, e.g. 'feed_restr_013_01.zip'
    sort_key_fn = parse(fn)  # from buran's answer
    d_ym = fpath_split[-2]  # dir, e.g. '202101'
    d_y = fpath_split[-3]  # dir, e.g. '2021'
    # Compare by year first, then month (last two characters of d_ym),
    # then the file-name key from buran's answer
    return (int(d_y), int(d_ym[4:])) + sort_key_fn

sorted_res = sorted(all_files_path, key=which_items_you_want_to_compare, reverse=True)
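If you don't want the years in reverse order, you can use -int(d_y), and likewise for the other numeric parts, in the key function to flip the order of that component.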
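A minimal sketch of that negation trick, under some assumptions: the `sort_key` helper, its `year_ascending` flag, and the `main/<year>/<yearmonth>/<file>` sample paths are made up for illustration and are not part of the answer. Negating the year inside the key cancels out `reverse=True` for that one component, so years ascend while everything else stays descending.

```python
import os
from functools import partial

def parse(fname):
    prefix, *middle, n1, n2 = fname.split('_')
    return (prefix, n1, n2, ''.join(middle))

def sort_key(fpath, year_ascending=False):
    # Hypothetical helper: expects .../<year>/<yearmonth>/<file name>
    *_, d_y, d_ym, fn = fpath.split(os.path.sep)
    year = int(d_y)
    if year_ascending:
        year = -year  # negation undoes reverse=True for the year only
    return (year, int(d_ym[4:])) + parse(fn)

file_names = [
    "feed_013_01.zip",
    "feed_restr_013_01.zip",
    "feed_013_30.zip",
    "feed_restr_013_30.zip",
]
# Illustrative paths for two year/month directories
sample_paths = [
    os.path.join("main", ym[:4], ym, fn)
    for ym in ("200101", "202101")
    for fn in file_names
]

newest_first = sorted(sample_paths, key=sort_key, reverse=True)
oldest_first = sorted(
    sample_paths, key=partial(sort_key, year_ascending=True), reverse=True
)

for p in oldest_first:
    print(p)
```

Negation only works for numeric components. The day is compared as a string here (`'30.zip'` against `'01.zip'`), which is fine as long as the numbers are zero-padded to the same width.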