Create folders based on filenames

Problem description

I have a folder with some 1500 Excel files. The format of each file name is something like this:

The first character of the file name is either '0' or '1', followed by 'd', followed by the date when the file was created, followed by the customer id (abcd, ef, g, hijkl, mno, pqr). The customer id has no fixed length and can vary.

I want to create a folder for each unique date (the folder name should be the date) and move the files with the same date into a single folder. So for the above example, 4 folders (20170101, 20170104, 20170109, 20170110) have to be created, with files sharing a date placed into their respective folders.

I want to know if there is any way to do this in Python. Sorry for not posting any sample code; I have no idea how to start.

Tags: python, file, iteration, directory, batch-processing

Solution


Try this out:

import os
import re

root_path = 'test'


def main():
    # Keep track of directories already created
    created_dirs = []

    # Go through all stuff in the directory
    file_names = os.listdir(root_path)
    for file_name in file_names:
        process_file(file_name, created_dirs)


def process_file(file_name, created_dirs):
    file_path = os.path.join(root_path, file_name)

    # Check if it's not itself a directory - safe guard
    if os.path.isfile(file_path):
        file_date, user_id, file_ext = get_file_info(file_name)

        # Check we could parse the infos of the file
        if file_date is not None \
            and user_id is not None \
            and file_ext is not None:
            # Make sure we haven't already created the directory
            if file_date not in created_dirs:
                create_dir(file_date)
                created_dirs.append(file_date)

            # Move the file and rename it
            os.rename(
                file_path,
                os.path.join(root_path, file_date, '{}.{}'.format(user_id, file_ext)))

            print(file_date, user_id)


def create_dir(dir_name):
    dir_path = os.path.join(root_path, dir_name)
    if not os.path.exists(dir_path) or not os.path.isdir(dir_path):
        os.mkdir(dir_path)


def get_file_info(file_name):
    match = re.search(r'[01]d(\d{8})([\w+-]+)\.(\w+)', file_name)
    if match:
        return match.group(1), match.group(2), match.group(3)

    return None, None, None


if __name__ == '__main__':
    main()
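
The script above also renames each file to just the customer id. If the original file names should be kept, a shorter variant might look like the sketch below. It assumes Python 3, uses `fullmatch` instead of `search` so that only names fitting the whole pattern are moved, and demonstrates itself on made-up file names in a temporary directory; `group_by_date` is a hypothetical helper name, not part of the code above.

```python
import re
import shutil
import tempfile
from pathlib import Path

# Same pattern as in the answer: 0/1 prefix, 'd', 8-digit date, customer id, extension.
PATTERN = re.compile(r'[01]d(\d{8})([\w+-]+)\.(\w+)')


def group_by_date(root):
    """Move each matching file in `root` into a subfolder named after its date."""
    root = Path(root)
    moved = {}
    # sorted() takes a snapshot, so folders created below are not revisited.
    for path in sorted(root.iterdir()):
        if not path.is_file():
            continue
        match = PATTERN.fullmatch(path.name)
        if match is None:
            continue  # leave files that do not fit the pattern untouched
        date = match.group(1)
        target_dir = root / date
        target_dir.mkdir(exist_ok=True)  # no error if the folder already exists
        shutil.move(str(path), str(target_dir / path.name))
        moved.setdefault(date, []).append(path.name)
    return moved


# Demo on hypothetical file names inside a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    for name in ['0d20170101abcd.xlsx', '1d20170101ef.xlsx',
                 '0d20170104g.xlsx', 'readme.txt']:
        (Path(tmp) / name).touch()
    result = group_by_date(tmp)
    leftover = sorted(p.name for p in Path(tmp).iterdir() if p.is_file())
    print(result)
    print(leftover)
```

To use it on real data, call `group_by_date` with the path of the folder holding the Excel files, ideally on a copy first.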

Note that, depending on the names of your files, you may need to adjust the regex I use, i.e. [01]d(\d{8})([\w+-]+)\.(\w+). It matches a '0' or '1', a literal 'd', an 8-digit date (first group), a customer id made of word characters, '+' or '-' (second group), and the file extension (third group).
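
To see what the regex captures before running it on real files, a small check like the following can help. The sample file names are made up for illustration, and `parse` is a hypothetical helper that mirrors `get_file_info`.

```python
import re

# Same pattern as in the answer above.
PATTERN = re.compile(r'[01]d(\d{8})([\w+-]+)\.(\w+)')

# Hypothetical file names, invented for this demonstration.
samples = ['0d20170101abcd.xlsx', '1d20170104ef.xls', 'notes.txt']


def parse(name):
    """Return (date, customer_id, extension), or None if the name does not match."""
    match = PATTERN.search(name)
    return match.groups() if match else None


for name in samples:
    print(name, '->', parse(name))
```

Names that do not fit the pattern, such as `notes.txt` here, yield `None`, which is the case the main script skips.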

