Using the row number as the index when reading a csv

Problem description

I have a csv file that looks like this:

30,60,14.3,53.6,0.71,403,0
30,60,15.3,54.9,0.72,403,0
30,60,16.5,56.2,0.73,403,0
30,60,17.9,57.5,0.74,403,0

There is no header, only data. The columns are

colNames = {
        'doa_in1': np.float64, 'doa_in2': np.float64,
        'doa_est1': np.float64, 'doa_est2': np.float64, 
        'rho': np.float64,
        'seed': np.int32, 'matl_chan':np.int32
        }

I read the csv with:

tmp_df = pd.read_csv(
                    io.BytesIO(tmp_csv), encoding='utf8',
                    header=None,
                    names=colNames.keys(), dtype=colNames,
                    converters={
                                'matl_chan': lambda x: bool(int(x))
                               }
                    )

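This raises a warning because I specify two possible conversions for matl_chan, but it is only a warning: pandas uses only the one given in converters (i.e. the lambda function).

I would like each row to have a number, or something else unique, as its index.

The reason is that I then process tmp_df with this function: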
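As a side note on the warning, here is a minimal sketch of one way to avoid it, assuming the converter is the conversion that is actually wanted for matl_chan: leave that column out of the dtype mapping so each column has exactly one conversion rule. The tmp_csv bytes below are a hypothetical stand-in built from the sample rows, and the names converters and dtypes are introduced only for illustration.

```python
import io

import numpy as np
import pandas as pd

# Hypothetical stand-in for tmp_csv: the four sample rows as raw bytes.
tmp_csv = (
    b"30,60,14.3,53.6,0.71,403,0\n"
    b"30,60,15.3,54.9,0.72,403,0\n"
    b"30,60,16.5,56.2,0.73,403,0\n"
    b"30,60,17.9,57.5,0.74,403,0\n"
)

colNames = {
    'doa_in1': np.float64, 'doa_in2': np.float64,
    'doa_est1': np.float64, 'doa_est2': np.float64,
    'rho': np.float64,
    'seed': np.int32, 'matl_chan': np.int32,
}

converters = {'matl_chan': lambda x: bool(int(x))}

# Columns handled by a converter are removed from the dtype mapping,
# so pandas never sees two competing conversions for the same column.
dtypes = {name: t for name, t in colNames.items() if name not in converters}

tmp_df = pd.read_csv(
    io.BytesIO(tmp_csv), encoding='utf8',
    header=None,
    names=list(colNames),  # plain list of the column names (an assumption of this sketch)
    dtype=dtypes,
    converters=converters,
)
```

The resulting matl_chan column holds the booleans produced by the lambda, while the remaining columns keep the dtypes listed in colNames.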

def remove_lines(df):
    THRES = 50
    THRES_angle = 10  # degrees
    is_converging = True
    for idx, row in df.iterrows():
        if idx == 0:
            is_converging = False
        # check if MUSIC started converging
        if abs(row['doa_est1']-row['doa_in1']) < THRES_angle:
            if abs(row['doa_est2']-row['doa_in2']) < THRES_angle:
                is_converging = True
        # calc error
        err = abs(row['doa_est1']- row['doa_in1'])+abs(row['doa_est2']-row['doa_in2'])
        if err > THRES and is_converging:
            df=df.drop(idx) 
    return df

However, every row has the index 30, so the function does not drop what I want and instead fails with this error:

KeyError: '[30] not found in axis'

The full stack trace is

---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
<ipython-input-143-b61c0402f9d7> in <module>
----> 1 df=get_dataframe()

<ipython-input-121-b76aab8b17ee> in get_dataframe()
     24                 continue
     25 
---> 26             tmp_df_sanitized = remove_lines(tmp_df)
     27             all_dataframes.append(tmp_df_sanitized)
     28 

<ipython-input-142-31019390251a> in remove_lines(df)
     62         err = abs(row['doa_est1']-row['doa_in1'])+abs(row['doa_est2']-row['doa_in2'])
     63         if err > THRES and is_converging:
---> 64             df=df.drop(idx)
     65             print("dropped {}".format(idx))
     66     return df

/usr/lib/python3.7/site-packages/pandas/core/frame.py in drop(self, labels, axis, index, columns, level, inplace, errors)
   3938                                            index=index, columns=columns,
   3939                                            level=level, inplace=inplace,
-> 3940                                            errors=errors)
   3941 
   3942     @rewrite_axis_style_signature('mapper', [('copy', True),

/usr/lib/python3.7/site-packages/pandas/core/generic.py in drop(self, labels, axis, index, columns, level, inplace, errors)
   3778         for axis, labels in axes.items():
   3779             if labels is not None:
-> 3780                 obj = obj._drop_axis(labels, axis, level=level, errors=errors)
   3781 
   3782         if inplace:

/usr/lib/python3.7/site-packages/pandas/core/generic.py in _drop_axis(self, labels, axis, level, errors)
   3810                 new_axis = axis.drop(labels, level=level, errors=errors)
   3811             else:
-> 3812                 new_axis = axis.drop(labels, errors=errors)
   3813             result = self.reindex(**{axis_name: new_axis})
   3814 

/usr/lib/python3.7/site-packages/pandas/core/indexes/base.py in drop(self, labels, errors)
   4962         if mask.any():
   4963             if errors != 'ignore':
-> 4964                 raise KeyError(
   4965                     '{} not found in axis'.format(labels[mask]))
   4966             indexer = indexer[~mask]

KeyError: '[30] not found in axis'
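One plausible reading of this trace, sketched on a hypothetical toy frame rather than the real data: when every row carries the label 30, the first df.drop(30) removes all of them at once, and the next call to drop(30) finds no such label left. The frame and the variable names below are assumptions made only for this illustration.

```python
import pandas as pd

# Hypothetical toy frame: four rows that all share the index label 30.
df = pd.DataFrame(
    {'doa_est1': [14.3, 15.3, 16.5, 17.9]},
    index=[30, 30, 30, 30],
)

# drop() works on labels, so this removes every row labelled 30.
after_first_drop = df.drop(30)
print(len(after_first_drop))

# A second drop of the same label has nothing left to match.
raised = False
try:
    after_first_drop.drop(30)
except KeyError as exc:
    raised = True
    print("KeyError:", exc)
```

With unique row labels, each drop(idx) would remove exactly one row and the loop in remove_lines could run to the end.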

Does anyone have a solution?

Edit: to be clearer, I would like the four rows shown above to have the row indices [0,1,2,3].

Tags: python, python-3.x, pandas

Solution
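A minimal sketch of one possible approach, assuming the goal is simply a default 0..n-1 row index. It passes the column names as a plain list and sets index_col=False, which tells read_csv not to use any data column as the index. Whether the dict_keys view passed as names is what caused the first column to become the index is an assumption here, not something the question confirms. The tmp_csv bytes are a hypothetical stand-in built from the sample rows.

```python
import io

import numpy as np
import pandas as pd

# Hypothetical stand-in for tmp_csv: the four sample rows as raw bytes.
tmp_csv = (
    b"30,60,14.3,53.6,0.71,403,0\n"
    b"30,60,15.3,54.9,0.72,403,0\n"
    b"30,60,16.5,56.2,0.73,403,0\n"
    b"30,60,17.9,57.5,0.74,403,0\n"
)

colNames = {
    'doa_in1': np.float64, 'doa_in2': np.float64,
    'doa_est1': np.float64, 'doa_est2': np.float64,
    'rho': np.float64,
    'seed': np.int32, 'matl_chan': np.int32,
}

tmp_df = pd.read_csv(
    io.BytesIO(tmp_csv), encoding='utf8',
    header=None,
    names=list(colNames),   # a real list instead of a dict_keys view
    index_col=False,        # never promote a data column to the index
    # matl_chan is left to the converter, so it is excluded from dtype.
    dtype={k: v for k, v in colNames.items() if k != 'matl_chan'},
    converters={'matl_chan': lambda x: bool(int(x))},
)

print(tmp_df.index.tolist())  # [0, 1, 2, 3]
```

With this index, idx == 0 in remove_lines matches only the first row and df.drop(idx) removes one row at a time.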

