How to process files by date in Python?

Problem description

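I have two kinds of files: xml files and txt files. Each file has a date in its name. If the date of an xml file matches the date of a txt file, I want to open the txt file, do some processing and write the output to a list. After that, I want to change the xml file. Multiple xml files can have the same date, but the txt file is unique, so more than one xml file can be linked to one txt file.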

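Now I have a problem. My to_csv list contains the data for both 20200907 and 20201025. I don't want it to work like that. I want my to_csv list to handle only one file (and therefore one date) at a time.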

import csv
import os
import re

output_xml = r"c:\desktop\energy\XML_Output"
output_txt = r"c:\desktop\energy\TXT_Output"

xml_name = os.listdir(output_xml)
txt_name = os.listdir(output_txt)
txt_name = [x.replace('-', '') for x in txt_name]  # remove the - in the filenames

# Extract the date from the xml and txt files. 
xml_dates = []
for file in xml_name:
    find = re.search(r"_(.\d+)-", file).group(1)  # raw string so \d is not treated as a string escape
    xml_dates.append(find)

txt_dates = []
for file in txt_name:
    find = re.search("MM(.+?)AB", file).group(1)
    txt_dates.append(find)

# THIS IS SOME REPRODUCIBLE OUTPUT OF WHAT THE SNIPPET ABOVE PRODUCES.
xml_dates = ['20200907', '20200908', '20201025', '20201025', '20201025', '20201025']
txt_dates = ['20200907', '20201025']

to_csv = []

for date_xml in xml_dates:
    for date_txt in txt_dates:
        if date_xml == date_txt:

              match_txt = [s for s in txt_name if date_txt in s]  # matching txt file  
              match_xml = [s for s in xml_name if date_xml in s]  # matching xml file

              match_txt_temp = match_txt[0]
              match_txt_score = [match_txt_temp[:6]+'-'+match_txt_temp[6:8]+'-'+match_txt_temp[8:10]+'-'+match_txt_temp[10:12]+match_txt_temp[12:]]

              with open(output_txt + "/" + match_txt_score[0], "r") as outer:
                reader = csv.reader(outer, delimiter="\t")  

                for row in reader:
                    read = [row for row in reader if row]
                    for row in read:
  
                        energy_level = float(row[20])  # csv yields strings; convert before comparing

                        if energy_level > 250:
                            to_csv.append(row)
                            
print(to_csv)

Current output:

[['1', '2', '3', '20200907', '4', '5'], 
['1', '2', '3', '20200907', '4', '5'], 
['1', '2', '3', '20200907', '4', '5'], 
['1', '2', '3', '20201025', '4', '5'], 
['1', '2', '3', '20201025', '4', '5']]

Desired output:

[[['1', '2', '3', '20200907', '4', '5'], 
['1', '2', '3', '20200907', '4', '5'], 
['1', '2', '3', '20200907', '4', '5']], 
[['1', '2', '3', '20201025', '4', '5'], 
['1', '2', '3', '20201025', '4', '5']]]

Tags: python, list, for-loop

Solution


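You said there is only one txt file per date, and that you only want to process the xml files that are linked to a txt file. That means a single loop over txt_dates is enough: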

...
for date_txt in txt_dates:
    date_xml = date_txt

    match_txt = [s for s in txt_name if date_txt in s]  # the matching txt file  
    match_xml = [s for s in xml_name if date_xml in s]  # possible matching xml files
    if len(match_xml) == 0:   # no matching xml files
        continue

    match_txt_temp = match_txt[0]
    match_txt_score = [match_txt_temp[:6]+'-'+match_txt_temp[6:8]+'-'
                       +match_txt_temp[8:10]+'-'+match_txt_temp[10:12]
                       +match_txt_temp[12:]]

    # prepare a new list for that date
    curr = list()

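    with open(os.path.join(output_txt, match_txt_score[0]), "r", newline="") as outer:
        reader = csv.reader(outer, delimiter="\t")
        next(reader, None)  # the original nested loop consumed the first row; skip it explicitly

        for row in reader:
            if not row:  # ignore blank lines
                continue
            energy_level = float(row[20])  # csv yields strings; convert before comparing
            if energy_level > 250:
                curr.append(row)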

    if len(curr) > 0:    # if the current date list is not empty append it
        to_csv.append(curr)
                        
print(to_csv)

Please note: since you did not provide a reproducible example, I could not test the code above, and it may contain typos...
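
Since the code above cannot be run without the real folders, here is a minimal, self-contained sketch of the same idea that can be. Everything in it is an assumption made for illustration: the file names, the in-memory file contents in FAKE_TXT_FILES, the helper names, the stricter 8-digit date patterns, and the energy column index (2 instead of 20, to keep the sample rows short). It keeps the rows in a dict keyed by date, which may be easier to work with than a list of lists if each date is later written to its own output.

```python
import csv
import io
import re

# Hypothetical stand-ins for the real files: file name -> tab-separated content.
# The names only mimic the "MM<date>AB" and "_<date>-" patterns from the question.
FAKE_TXT_FILES = {
    "MM20200907AB.txt": "a\t20200907\t300\nb\t20200907\t100\n",
    "MM20201025AB.txt": "c\t20201025\t260\n\nd\t20201025\t400\n",
}
FAKE_XML_NAMES = [
    "x_20200907-1.xml",
    "x_20200908-1.xml",
    "x_20201025-1.xml",
    "x_20201025-2.xml",
]

ENERGY_COLUMN = 2  # assumption: the question uses column 20; 2 keeps the sample small
THRESHOLD = 250


def date_from_txt(name):
    """Return the 8-digit date between 'MM' and 'AB', or None if absent."""
    found = re.search(r"MM(\d{8})AB", name)
    return found.group(1) if found else None


def date_from_xml(name):
    """Return the 8-digit date between '_' and '-', or None if absent."""
    found = re.search(r"_(\d{8})-", name)
    return found.group(1) if found else None


def rows_above_threshold(text):
    """Parse tab-separated text and keep non-empty rows above the threshold."""
    reader = csv.reader(io.StringIO(text), delimiter="\t")
    return [row for row in reader
            if row and float(row[ENERGY_COLUMN]) > THRESHOLD]


def group_by_date(txt_files, xml_names):
    """Build {date: rows} for every txt file that has at least one xml file."""
    xml_dates = {date_from_xml(name) for name in xml_names}
    grouped = {}
    for name, text in txt_files.items():
        date = date_from_txt(name)
        if date is None or date not in xml_dates:
            continue  # no linked xml file for this txt file
        rows = rows_above_threshold(text)
        if rows:  # only keep dates that produced at least one row
            grouped[date] = rows
    return grouped


to_csv_by_date = group_by_date(FAKE_TXT_FILES, FAKE_XML_NAMES)
for date, rows in to_csv_by_date.items():
    print(date, rows)
```

If the nested-list shape from the question is still needed, list(to_csv_by_date.values()) gives one inner list per date.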

