python - Using the row number as the index when reading a CSV
Problem description
I have a CSV file that looks like this:
30,60,14.3,53.6,0.71,403,0
30,60,15.3,54.9,0.72,403,0
30,60,16.5,56.2,0.73,403,0
30,60,17.9,57.5,0.74,403,0
There is no header row, only data. The columns are:
colNames = {
    'doa_in1': np.float64, 'doa_in2': np.float64,
    'doa_est1': np.float64, 'doa_est2': np.float64,
    'rho': np.float64,
    'seed': np.int32, 'matl_chan': np.int32
}
I read the CSV like this:
tmp_df = pd.read_csv(
    io.BytesIO(tmp_csv), encoding='utf8',
    header=None,
    names=colNames.keys(), dtype=colNames,
    converters={
        'matl_chan': lambda x: bool(int(x))
    }
)
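This raises a warning because I specified two possible conversions for matl_chan (one in dtype, one in converters). It is only a warning, though: pandas uses just the one in converters (i.e. the lambda function).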
I would like every row to have a number, or something else unique, as its index.
That is because I then process tmp_df with this function:
def remove_lines(df):
    THRES = 50
    THRES_angle = 10  # degrees
    is_converging = True
    for idx, row in df.iterrows():
        if idx == 0:
            is_converging = False
        # check if MUSIC started converging
        if abs(row['doa_est1'] - row['doa_in1']) < THRES_angle:
            if abs(row['doa_est2'] - row['doa_in2']) < THRES_angle:
                is_converging = True
        # calc error
        err = abs(row['doa_est1'] - row['doa_in1']) + abs(row['doa_est2'] - row['doa_in2'])
        if err > THRES and is_converging:
            df = df.drop(idx)
    return df
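The same filtering rule can also be sketched with a boolean mask rather than row-by-row drop calls. This is a hypothetical variant, not the code used in the question. It assumes the first row of the frame plays the role of `idx == 0`, so a frame is treated as not converging until both estimates fall within the angle threshold. `remove_lines_masked` and `demo` are made-up names.

```python
import numpy as np
import pandas as pd


def remove_lines_masked(df, thres=50, thres_angle=10):
    """Hypothetical mask-based variant of remove_lines."""
    err1 = (df['doa_est1'] - df['doa_in1']).abs().to_numpy()
    err2 = (df['doa_est2'] - df['doa_in2']).abs().to_numpy()
    # True from the first row where both estimates are within
    # thres_angle degrees, and for every row after it.
    converging = np.maximum.accumulate((err1 < thres_angle) & (err2 < thres_angle))
    # Rows to remove: large total error after convergence has started.
    to_drop = ((err1 + err2) > thres) & converging
    # A boolean array selects rows by position, not by index label.
    return df[~to_drop]


# Made-up data: only the third row has a large error after convergence.
demo = pd.DataFrame({
    'doa_in1': [30.0] * 4, 'doa_in2': [60.0] * 4,
    'doa_est1': [100.0, 31.0, 100.0, 35.0],
    'doa_est2': [60.0, 61.0, 60.0, 58.0],
})
kept = remove_lines_masked(demo)
```

Because the mask is positional, this version does not rely on the index labels being unique.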
However, every row has index 30, so the function drops nothing, because I get this error:
KeyError: '[30] not found in axis'
The full stack trace is:
---------------------------------------------------------------------------
KeyError Traceback (most recent call last)
<ipython-input-143-b61c0402f9d7> in <module>
----> 1 df=get_dataframe()
<ipython-input-121-b76aab8b17ee> in get_dataframe()
24 continue
25
---> 26 tmp_df_sanitized = remove_lines(tmp_df)
27 all_dataframes.append(tmp_df_sanitized)
28
<ipython-input-142-31019390251a> in remove_lines(df)
62 err = abs(row['doa_est1']-row['doa_in1'])+abs(row['doa_est2']-row['doa_in2'])
63 if err > THRES and is_converging:
---> 64 df=df.drop(idx)
65 print("dropped {}".format(idx))
66 return df
/usr/lib/python3.7/site-packages/pandas/core/frame.py in drop(self, labels, axis, index, columns, level, inplace, errors)
3938 index=index, columns=columns,
3939 level=level, inplace=inplace,
-> 3940 errors=errors)
3941
3942 @rewrite_axis_style_signature('mapper', [('copy', True),
/usr/lib/python3.7/site-packages/pandas/core/generic.py in drop(self, labels, axis, index, columns, level, inplace, errors)
3778 for axis, labels in axes.items():
3779 if labels is not None:
-> 3780 obj = obj._drop_axis(labels, axis, level=level, errors=errors)
3781
3782 if inplace:
/usr/lib/python3.7/site-packages/pandas/core/generic.py in _drop_axis(self, labels, axis, level, errors)
3810 new_axis = axis.drop(labels, level=level, errors=errors)
3811 else:
-> 3812 new_axis = axis.drop(labels, errors=errors)
3813 result = self.reindex(**{axis_name: new_axis})
3814
/usr/lib/python3.7/site-packages/pandas/core/indexes/base.py in drop(self, labels, errors)
4962 if mask.any():
4963 if errors != 'ignore':
-> 4964 raise KeyError(
4965 '{} not found in axis'.format(labels[mask]))
4966 indexer = indexer[~mask]
KeyError: '[30] not found in axis'
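One plausible reading of this error, assuming every row really carries the label 30: `drop` removes all rows with the given label at once, so a second `drop(30)` finds nothing left to remove. The small frame below is made up to reproduce that behaviour in isolation.

```python
import pandas as pd

# Four rows that all share the index label 30.
df = pd.DataFrame(
    {'doa_est1': [14.3, 15.3, 16.5, 17.9]},
    index=[30, 30, 30, 30],
)

# drop() works by label, so every row labelled 30 is removed in one call.
after_first = df.drop(30)

# Dropping the same label again fails, because it is no longer in the index.
try:
    after_first.drop(30)
    second_drop_failed = False
except KeyError:
    second_drop_failed = True
```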
Does anyone have a solution?
Edit: to be clearer, I would like [0,1,2,3] as the row indices of the four lines shown above.
Solution
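No answer text was captured for this question, so the following is only a sketch of two commonly used options, not a confirmed fix. The sample rows shown in the question have exactly seven fields and would not by themselves produce an index of 30. The assumption here is that the real file has more fields per line than names (for example a trailing delimiter), which can lead pandas to use the leading column as the index. `names` and `sample_csv` are illustrative.

```python
import io

import pandas as pd

names = ['doa_in1', 'doa_in2', 'doa_est1', 'doa_est2',
         'rho', 'seed', 'matl_chan']

sample_csv = b"""30,60,14.3,53.6,0.71,403,0
30,60,15.3,54.9,0.72,403,0
30,60,16.5,56.2,0.73,403,0
30,60,17.9,57.5,0.74,403,0
"""

# Option 1: tell read_csv not to use the first column as the index.
df = pd.read_csv(
    io.BytesIO(sample_csv), encoding='utf8',
    header=None, names=names,
    index_col=False,
)

# Option 2: whatever index the frame ended up with, discard it and
# renumber the rows 0..n-1. The duplicated index is simulated here.
df_dup = df.copy()
df_dup.index = [30] * len(df_dup)
df_fixed = df_dup.reset_index(drop=True)
```

Either way the rows end up labelled 0 to 3, so `df.drop(idx)` inside `remove_lines` would remove one row at a time.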