python - How to process files based on dates in Python?
Problem description
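I have two kinds of files, xml files and txt files. The files have a date in their name. If the date of an xml file matches the date of a txt file, I want to open the txt file, do some processing and write the output to a list. After that, I want to change the xml file. Multiple xml files can have the same date, but the txt file is unique, so more than one xml file can be linked to one txt file.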
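Now I have a problem. My to_csv list contains the data of both 20200907 and 20201025. I don't want it to work like that. I want my to_csv list to handle only one file (and therefore one date) at a time.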
import csv
import os
import re

output_xml = r"c:\desktop\energy\XML_Output"
output_txt = r"c:\desktop\energy\TXT_Output"
xml_name = os.listdir(output_xml)
txt_name = os.listdir(output_txt)
txt_name = [x.replace('-', '') for x in txt_name]  # remove the - in the filenames

# Extract the date from the xml and txt files.
xml_dates = []
for file in xml_name:
    find = re.search(r"_(.\d+)-", file).group(1)
    xml_dates.append(find)
txt_dates = []
for file in txt_name:
    find = re.search("MM(.+?)AB", file).group(1)
    txt_dates.append(find)

# THIS IS SOME REPRODUCIBLE OUTPUT FROM WHAT IS RECEIVED FROM THE ABOVE SNIPPET.
xml_dates = ['20200907', '20200908', '20201025', '20201025', '20201025', '20201025']
txt_dates = ['20200907', '20201025']

to_csv = []
for date_xml in xml_dates:
    for date_txt in txt_dates:
        if date_xml == date_txt:
            match_txt = [s for s in txt_name if date_txt in s]  # matching txt file
            match_xml = [s for s in xml_name if date_xml in s]  # matching xml file
            match_txt_temp = match_txt[0]
            # put the dashes back to rebuild the real txt filename
            match_txt_score = [match_txt_temp[:6]+'-'+match_txt_temp[6:8]+'-'+match_txt_temp[8:10]+'-'+match_txt_temp[10:12]+match_txt_temp[12:]]
            with open(output_txt + "/" + match_txt_score[0], "r") as outer:
                reader = csv.reader(outer, delimiter="\t")
                for row in reader:  # consumes the first row; the comprehension reads the rest
                    read = [row for row in reader if row]
                    for row in read:
                        energy_level = float(row[20])  # csv yields strings, so convert before comparing
                        if energy_level > 250:
                            to_csv.append(row)
print(to_csv)
Current output:
[['1', '2', '3', '20200907', '4', '5'],
['1', '2', '3', '20200907', '4', '5'],
['1', '2', '3', '20200907', '4', '5'],
['1', '2', '3', '20201025', '4', '5'],
['1', '2', '3', '20201025', '4', '5']]
Desired output:
[[['1', '2', '3', '20200907', '4', '5'],
['1', '2', '3', '20200907', '4', '5'],
['1', '2', '3', '20200907', '4', '5']],
[['1', '2', '3', '20201025', '4', '5'],
['1', '2', '3', '20201025', '4', '5']]]
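Solution
You said that there is only one txt file per date, and that you only want to process the xml files linked to a txt file. That means a single loop over txt_dates is enough: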
...
for date_txt in txt_dates:
    date_xml = date_txt
    match_txt = [s for s in txt_name if date_txt in s]  # the matching txt file
    match_xml = [s for s in xml_name if date_xml in s]  # possible matching xml files
    if len(match_xml) == 0:  # no matching xml files
        continue
    match_txt_temp = match_txt[0]
    match_txt_score = [match_txt_temp[:6]+'-'+match_txt_temp[6:8]+'-'
                       +match_txt_temp[8:10]+'-'+match_txt_temp[10:12]
                       +match_txt_temp[12:]]
    # prepare a new list for that date
    curr = []
    with open(output_txt + "/" + match_txt_score[0], "r") as outer:
        reader = csv.reader(outer, delimiter="\t")
        for row in reader:  # consumes the first row; the comprehension reads the rest
            read = [row for row in reader if row]
            for row in read:
                energy_level = float(row[20])  # csv yields strings, so convert before comparing
                if energy_level > 250:
                    curr.append(row)
    if len(curr) > 0:  # if the current date list is not empty append it
        to_csv.append(curr)
print(to_csv)
Beware: since what you provided is not a reproducible example, I could not test the above code, and typos are possible...
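Since the code above depends on files that are not available, here is a minimal self-contained sketch of the same idea (one new list per txt date, appended to the outer list only when it is non-empty). Everything in it is an assumption made for illustration: the helper names `rows_above_threshold` and `group_by_date`, the in-memory `FAKE_TXT` data standing in for the txt files, and the column index 2 used in place of `row[20]`. Unlike the loop above, it does not skip the first row of each file.

```python
import csv
import io


def rows_above_threshold(handle, column, threshold):
    """Return non-empty rows whose numeric value in `column` exceeds `threshold`."""
    reader = csv.reader(handle, delimiter="\t")
    return [row for row in reader if row and float(row[column]) > threshold]


def group_by_date(txt_dates, xml_dates, open_txt, column, threshold):
    """Build one list of rows per txt date that has at least one xml file."""
    xml_date_set = set(xml_dates)
    grouped = []
    for date in txt_dates:
        if date not in xml_date_set:  # no xml file linked to this txt file
            continue
        with open_txt(date) as handle:
            current = rows_above_threshold(handle, column, threshold)
        if current:  # skip dates that produced no rows
            grouped.append(current)
    return grouped


# Hypothetical in-memory stand-ins for the tab-separated txt files, keyed by date.
FAKE_TXT = {
    "20200907": "a\t20200907\t300\nb\t20200907\t100\nc\t20200907\t260\n",
    "20201025": "d\t20201025\t251\n",
    "20201101": "e\t20201101\t999\n",
}

result = group_by_date(
    txt_dates=["20200907", "20201025", "20201101"],
    xml_dates=["20200907", "20200908", "20201025", "20201025"],
    open_txt=lambda date: io.StringIO(FAKE_TXT[date]),
    column=2,
    threshold=250,
)
print(result)
# [[['a', '20200907', '300'], ['c', '20200907', '260']], [['d', '20201025', '251']]]
```

Passing `open_txt` as a callable keeps the grouping logic independent of how the file name is rebuilt from the date; with real files it could wrap `open(...)` on the reconstructed path instead of `io.StringIO`.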