Cannot read csv with date parsing, dtypes and chunksize

Problem description

I migrated a script from Python 3.6 to Python 3.7, and the same code now raises the following error when I read a .csv file with pandas:

ValueError: not all elements from date_cols are numpy arrays

My .csv looks like this:

ID;BEGIN_DAT
160788043;27.10.2019 00:03
160788044;27.10.2019 00:10
160788045;27.10.2019 00:06
160788046;27.10.2019 00:09

Here is my code:

import pandas as pd

csv_path = 'test3.csv'
delimiter = ';'
chunksize = 50000

CSV_TYPES = {'ID': float,
             'BEGIN_DAT': 'category'}

date_columns = ['BEGIN_DAT']

csv_data = pd.read_csv(csv_path,
                       delimiter=delimiter,
                       encoding="Latin-1",
                       parse_dates=date_columns,
                       chunksize=chunksize,
                       dtype=CSV_TYPES)

csv_data


for chunk in csv_data:
    chunk # call - write chunk to database

csv_data gives me:

<pandas.io.parsers.TextFileReader at 0x7f57fc30ff28>

But when I iterate over the chunks, I get the error:

for chunk in csv_data:
    chunk
ValueError                                Traceback (most recent call last)
<ipython-input-110-7a91d8f960fe> in <module>
----> 1 for chunk in csv_data:
      2     chunk
...
...
/usr/local/lib/python3.7/site-packages/pandas/io/parsers.py in converter(*date_cols)
   3260     def converter(*date_cols):
   3261         if date_parser is None:
-> 3262             strs = parsing._concat_date_cols(date_cols)
   3263 
   3264             try:

pandas/_libs/tslibs/parsing.pyx in pandas._libs.tslibs.parsing._concat_date_cols()

ValueError: not all elements from date_cols are numpy arrays

I want to write the chunk to the database, but I can't find the reason for this error.

EDIT:

The code ran for years with pandas 0.23.4 and numpy 1.15.2. With the current versions, pandas 0.25.2 and numpy 1.17.3, it no longer works.

I created a venv with the old versions, and there the error does not occur.

Any ideas why I get this error? Can the reason be changes in pandas?

Thanks!

Tags: pandas, csv, python-3.7, chunks

Solution
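A likely cause, judging from the traceback rather than from a confirmed diagnosis: BEGIN_DAT is listed both in dtype (as 'category') and in parse_dates. The date converter in pandas 0.25 appears to expect plain numpy arrays, and a column already converted to a categorical is not one. The sketch below avoids the conflict by reading BEGIN_DAT as a string and converting it per chunk with pd.to_datetime. The helper name read_chunks, the in-memory sample data, and the explicit format string are assumptions made for illustration; they are not part of the original script.

```python
import io

import pandas as pd

# Sample data in the same layout as the file shown in the question.
SAMPLE_CSV = (
    "ID;BEGIN_DAT\n"
    "160788043;27.10.2019 00:03\n"
    "160788044;27.10.2019 00:10\n"
    "160788045;27.10.2019 00:06\n"
    "160788046;27.10.2019 00:09\n"
)


def read_chunks(source, chunksize=50000):
    """Yield DataFrame chunks with BEGIN_DAT converted to datetime.

    Hypothetical helper: BEGIN_DAT is read as a plain string (not as
    'category') and parsed afterwards, so dtype and date parsing no
    longer compete for the same column.
    """
    reader = pd.read_csv(
        source,
        delimiter=";",
        encoding="Latin-1",
        dtype={"ID": float, "BEGIN_DAT": str},
        chunksize=chunksize,
    )
    for chunk in reader:
        # Assumed format: day.month.year hour:minute, as in the sample rows.
        chunk["BEGIN_DAT"] = pd.to_datetime(
            chunk["BEGIN_DAT"], format="%d.%m.%Y %H:%M"
        )
        yield chunk


# A small chunksize is used here only to show that several chunks are produced.
chunks = list(read_chunks(io.BytesIO(SAMPLE_CSV.encode("latin-1")), chunksize=2))
for chunk in chunks:
    print(chunk.dtypes)  # placeholder for writing the chunk to the database
```

With a real file, the path ('test3.csv' in the question) can be passed as source instead of the BytesIO object. If the categorical dtype is still needed downstream, chunk["BEGIN_DAT"].astype("category") can be applied after the conversion. An alternative that stays closer to the original call is to keep parse_dates=['BEGIN_DAT'], drop 'BEGIN_DAT' from the dtype dictionary, and probably add dayfirst=True, since the dates are written day first.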

