Parsing timestamps with the csv and datetime modules

Problem description

I'm having trouble with Python's datetime module. I have data from a CSV file:

user_id,timestamp
563,0:00:21
671,0:00:26
780,0:00:28

Here is my code:

import csv
from datetime import datetime

path = "/home/haldrik/dev/python/data/dataset.csv"
file = open(path, newline='')

reader = csv.reader(file, delimiter=',')

header = next(reader) # Ignore first row.

data = []
for row in reader:
    # row = [user_id, timestamp]
    user_id = row[0]
    timestamp = datetime.strptime(row[1], '%H:%M:%S').time()
    
    data.append([user_id, timestamp])

This code raises the following error:

Traceback (most recent call last):
  File "/home/haldrik/dev/python/instances_web_site.py", line 15, in <module>
    date = datetime.strptime(row[1], '%H:%M:%S').time()
  File "/usr/lib/python3.8/_strptime.py", line 568, in _strptime_datetime
    tt, fraction, gmtoff_fraction = _strptime(data_string, format)
  File "/usr/lib/python3.8/_strptime.py", line 349, in _strptime
    raise ValueError("time data %r does not match format %r" %
ValueError: time data '' does not match format '%H:%M:%S'

I can't find where the error is. As far as I can see, the data matches the specified time format.

Isolating the CSV import step, I can confirm that it works. See this snippet (not part of the code above):

data_import = [row for row in reader]
print(data_import[0])

It prints:

['563','0:00:21']

Tags: python, csv, parsing, time

Solution


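  • One or more values in the timestamp column are the problem. At least one row looks like 440, (a user ID followed by an empty timestamp), and that empty string is what produces time data '' does not match format '%H:%M:%S'.
  • Wrap the datetime.strptime(row[1], '%H:%M:%S').time() call in a try-except block.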
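To confirm the diagnosis in the first bullet, it can help to list the rows that fail to parse along with their line numbers. The following is a minimal sketch, not part of the original answer: the find_bad_rows helper and the SAMPLE string are hypothetical names introduced for illustration, and the sample data assumes one row with an empty timestamp.

```python
import csv
import io
from datetime import datetime

# Hypothetical sample mirroring the question's file, plus one row
# with an empty timestamp (an assumption for illustration).
SAMPLE = "user_id,timestamp\n563,0:00:21\n671,0:00:26\n440,\n780,0:00:28\n"


def find_bad_rows(lines, fmt="%H:%M:%S"):
    """Return (line_number, row) pairs whose second column does not match fmt."""
    reader = csv.reader(lines)
    next(reader)  # Skip the header row.
    bad = []
    for row in reader:
        try:
            datetime.strptime(row[1], fmt)
        except (ValueError, IndexError):
            # ValueError: the value does not match fmt (for example, it is empty).
            # IndexError: the row has no second column at all (for example, a blank line).
            bad.append((reader.line_num, row))
    return bad


bad_rows = find_bad_rows(io.StringIO(SAMPLE))
print(bad_rows)
```

For a real file, passing an open file object (opened with newline='') in place of the io.StringIO wrapper should work the same way, since csv.reader accepts any iterable of lines.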

test.csv

user_id,timestamp
563,0:00:21
671,0:00:26
780,0:00:28
440,

Code

import csv
from datetime import datetime

path = "test.csv"

data = []
# The with block closes the file even if an exception is raised.
with open(path, newline='') as file:
    reader = csv.reader(file, delimiter=',')
    header = next(reader)  # Skip the header row.

    for row in reader:
        # row = [user_id, timestamp]
        user_id = row[0]
        try:
            timestamp = datetime.strptime(row[1], '%H:%M:%S').time()
        except ValueError:
            # Keep the raw value (here an empty string) when it does not parse.
            timestamp = row[1]
            # continue  # Uncomment to skip the row instead of adding it to data.

        data.append([user_id, timestamp])

print(data)
[out]:
[['563', datetime.time(0, 0, 21)], ['671', datetime.time(0, 0, 26)], ['780', datetime.time(0, 0, 28)], ['440', '']]
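
The output above mixes datetime.time objects with raw strings in the same column. One possible alternative, sketched here under the assumption that a missing timestamp is better represented as None, is to move the parsing into a small helper. The name parse_time_or_none is hypothetical and not part of the original answer.

```python
from datetime import datetime, time
from typing import Optional


def parse_time_or_none(text: str, fmt: str = "%H:%M:%S") -> Optional[time]:
    """Parse text as a time of day; return None if it is blank or malformed."""
    text = text.strip()
    if not text:
        return None
    try:
        return datetime.strptime(text, fmt).time()
    except ValueError:
        return None


# Rows as csv.reader would yield them (sample values for illustration).
rows = [["563", "0:00:21"], ["440", ""]]
data = [[user_id, parse_time_or_none(ts)] for user_id, ts in rows]
print(data)
```

With this approach, later code can test for None instead of checking whether each value is a string or a time object.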
