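python - Filling a fixed number (x) of new columns using tolist() when it sometimes yields fewer items (< x)
Question
I am using tolist() to split the 8-item lists in one column ('modelGreeks') into 8 new columns in the same DataFrame: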
pd.DataFrame(df['modelGreeks'].tolist(), index=df.index)
df[['IV_model', 59, 'Price_model', 61, 62, 63, 64, 'undPrice']] = pd.DataFrame(df['modelGreeks'].tolist(), index=df.index)
This is the kind of list I normally get in the 'modelGreeks' column:
(0.2953686167703842, -1.9317880628477724e-14, 1.4648640549124297e-15, 0.0, 6.240571011994176e-13, 1.1840837166645831e-15, -1.4648640549124297e-15, 10.444000244140625)
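This works perfectly 9 times out of 10. Sometimes, however, the data I retrieve through the API is imperfect or incomplete. Instead of the expected list of 8 items, the 'modelGreeks' field contains a None value, and the second line of code raises the following error (logically, because it tries to fill 8 columns with only 1 available value):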
ValueError: Columns must be same length as key
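The length mismatch can be reproduced in isolation. The following is a minimal sketch, assuming a column in which every cell is None; the `greeks` Series is made up for illustration and is not data from the API. The frame built from tolist() then typically has a single column, which cannot be assigned to 8 column names.

```python
import pandas as pd

# Hypothetical stand-in for a 'modelGreeks' column that holds only missing values.
greeks = pd.Series([None, None, None], dtype=object)

# Same construction as in the question: one row per cell of the column.
expanded = pd.DataFrame(greeks.tolist(), index=greeks.index)

# The number of columns produced here is what has to match the 8 target names.
n_cols = expanded.shape[1]
print(n_cols)
```

Comparing `n_cols` with the number of target column names before assigning is a cheap way to detect the incomplete case.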
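I am looking for a solution that creates and fills the 8 new columns in every case, for example with 0, NaN, or None.
I hope someone can help. Thanks in advance for your effort.
The following code works: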
df1 = pd.DataFrame(columns=['IV_model', 59, 'Price_model', 61, 62, 63, 64, 'undPrice','modelGreeks'])
df1['modelGreeks'] = [[None, None, None, None, None, None, None, None], None, None, None, None]
df1[['IV_model', 59, 'Price_model', 61, 62, 63, 64, 'undPrice']] = df1['modelGreeks'].apply(pd.Series)
It returns:
   IV_model   59  Price_model   61   62   63   64  undPrice                                       modelGreeks
0       NaN  NaN          NaN  NaN  NaN  NaN  NaN       NaN  [None, None, None, None, None, None, None, None]
1       NaN  NaN          NaN  NaN  NaN  NaN  NaN       NaN                                              None
2       NaN  NaN          NaN  NaN  NaN  NaN  NaN       NaN                                              None
3       NaN  NaN          NaN  NaN  NaN  NaN  NaN       NaN                                              None
4       NaN  NaN          NaN  NaN  NaN  NaN  NaN       NaN                                              None
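That is fine. The only problem is that at certain moments the dataset I receive from Interactive Brokers through the API contains only scalar None values in every row of the modelGreeks column. If I apply that to the test case, I get the error again ("ValueError: Columns must be same length as key"):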
df1 = pd.DataFrame(columns=['IV_model', 59, 'Price_model', 61, 62, 63, 64, 'undPrice','modelGreeks'])
df1['modelGreeks'] = [None, None, None, None, None]
df1[['IV_model', 59, 'Price_model', 61, 62, 63, 64, 'undPrice']] = df1['modelGreeks'].apply(pd.Series)
Traceback (most recent call last):
  File "/Users/floris/PycharmProjects/ib_insync/test1.py", line 9, in <module>
    df1[['IV_model', 59, 'Price_model', 61, 62, 63, 64, 'undPrice']] = df1['modelGreeks'].apply(pd.Series)
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/pandas/core/frame.py", line 3367, in __setitem__
    self._setitem_array(key, value)
  File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/pandas/core/frame.py", line 3389, in _setitem_array
    raise ValueError('Columns must be same length as key')
ValueError: Columns must be same length as key
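In this case too, I would like to see only NaN values in the 8 columns.
Solution
Instead of creating a new DataFrame, convert each list in the column to a Series: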
df[['IV_model', 59, 'Price_model', 61, 62, 63, 64, 'undPrice']] = df['modelGreeks'].apply(pd.Series)
Test:
df = pd.DataFrame(columns=['IV_model', 59, 'Price_model', 61, 62, 63, 64, 'undPrice','modelGreeks'])
df['modelGreeks'] = [[1,2,3,4,5,6,7,8], [1,2,None,4,5,6,7,8], [1,2,3,4,5,6,7], [None], None, [None,None,None,None,None]]
df[['IV_model', 59, 'Price_model', 61, 62, 63, 64, 'undPrice']] = df['modelGreeks'].apply(pd.Series)
Output:
   IV_model   59  Price_model  ...   64  undPrice                     modelGreeks
0       1.0  2.0          3.0  ...  7.0       8.0        [1, 2, 3, 4, 5, 6, 7, 8]
1       1.0  2.0          NaN  ...  7.0       8.0     [1, 2, None, 4, 5, 6, 7, 8]
2       1.0  2.0          3.0  ...  7.0       NaN           [1, 2, 3, 4, 5, 6, 7]
3       NaN  NaN          NaN  ...  NaN       NaN                          [None]
4       NaN  NaN          NaN  ...  NaN       NaN                            None
5       NaN  NaN          NaN  ...  NaN       NaN  [None, None, None, None, None]
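The test above mixes lists and None, so at least one row supplies all 8 values. The question also describes the case where every row is a scalar None, and there the question's own traceback shows that apply(pd.Series) does not yield 8 columns. One possible way to cover that case, offered as a minimal sketch rather than as part of the answer above, is to pad every cell to exactly 8 values before building the frame. The helper names `pad_greeks` and `expand_greeks` are hypothetical and not part of pandas.

```python
import numpy as np
import pandas as pd

# Target column names taken from the question.
NEW_COLS = ['IV_model', 59, 'Price_model', 61, 62, 63, 64, 'undPrice']


def pad_greeks(value, width):
    """Return a list of exactly `width` items, using NaN for anything missing."""
    if not isinstance(value, (list, tuple)):
        # A scalar None (or any other non-sequence) becomes a full row of NaN.
        return [np.nan] * width
    # Replace None entries with NaN and cut off anything beyond `width`.
    items = [np.nan if item is None else item for item in value][:width]
    return items + [np.nan] * (width - len(items))


def expand_greeks(df, source='modelGreeks', columns=NEW_COLS):
    """Return a copy of `df` with the `source` column spread over `columns`."""
    columns = list(columns)
    rows = [pad_greeks(value, len(columns)) for value in df[source]]
    expanded = pd.DataFrame(rows, index=df.index, columns=columns)
    out = df.copy()
    out[columns] = expanded
    return out


# Every row is a scalar None, as in the failing test case from the question.
df1 = pd.DataFrame({'modelGreeks': [None, None, None, None, None]})
result = expand_greeks(df1)
print(result[NEW_COLS].isna().all().all())
```

Because every row is padded to the same width, the expanded frame always has as many columns as there are target names, regardless of how many rows are incomplete. Lists longer than 8 items are truncated in this sketch, which is an assumption about the desired behavior.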