python - Understanding Python Pandas dataframes
Problem description
I am learning Pandas and am having difficulty understanding pivot tables. Below is the sample program I am running.
import pandas as pd
df = pd.read_csv('/Users/xxx/Desktop/df.csv')
print(df)
df = df.pivot_table(index='__timestamp', columns=[], values=['passed_count', 'failed_count'])
print(df)
The program prints the following output:
__timestamp failed_count passed_count Unnamed: 3
0 27/05/18 0.019417 0.980583
1 03/06/18 0.427136 0.839196
2 10/06/18 0.839416 0.854015
3 17/06/18 0.403846 0.913462
4 24/06/18 1.429688 0.757812
5 01/07/18 6.781457 0.701987
6 08/07/18 0.324561 0.929825
7 15/07/18 0.295082 0.970492
8 22/07/18 0.849802 0.960474
9 29/07/18 0.673333 0.923333
10 05/08/18 0.276657 0.919308
11 12/08/18 0.242105 0.821053
12 19/08/18 0.176471 0.976471
passed_count
__timestamp
01/07/18 0.701987
03/06/18 0.839196
05/08/18 0.919308
08/07/18 0.929825
10/06/18 0.854015
12/08/18 0.821053
15/07/18 0.970492
17/06/18 0.913462
19/08/18 0.976471
22/07/18 0.960474
24/06/18 0.757812
27/05/18 0.980583
29/07/18 0.923333
I don't understand why the third column is absent after calling pivot_table(). Is it OK to pass multiple values as I did above? What is the significance of the values option?
Edit:
As requested in the comments, the CSV file contents are:
__timestamp,failed_count,passed_count,
27/05/18,0.019417 ,0.980583,
03/06/18,0.427136 ,0.839196,
10/06/18,0.839416 ,0.854015,
17/06/18,0.403846 ,0.913462,
24/06/18,1.429688 ,0.757812,
01/07/18,6.781457 ,0.701987,
08/07/18,0.324561 ,0.929825,
15/07/18,0.295082 ,0.970492,
22/07/18,0.849802 ,0.960474,
29/07/18,0.673333 ,0.923333,
05/08/18,0.276657 ,0.919308,
12/08/18,0.242105 ,0.821053,
19/08/18,0.176471 ,0.976471,
Output of df.head() immediately after reading the CSV:
__timestamp failed_count passed_count Unnamed: 3
0 27/05/18 0.019417 0.980583
1 03/06/18 0.427136 0.839196
2 10/06/18 0.839416 0.854015
3 17/06/18 0.403846 0.913462
4 24/06/18 1.429688 0.757812
Solution
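As we found out in the comments, pandas' pivot_table function silently ignores any non-numeric columns (str, in this case) in the values list, and the failed_count column was being read as exactly that.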
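One possible way to work around this, sketched under the assumption that failed_count really was loaded as strings: check the column dtypes, convert the column to a numeric type, and only then pivot. The small in-memory frame below is a hypothetical stand-in for df.csv (three rows copied from the question), built by hand so that the string dtype does not depend on how a given pandas version parses the file.

```python
import pandas as pd

# Hypothetical stand-in for df.csv: failed_count is deliberately stored as
# strings with a trailing space, mimicking how the column was interpreted.
df = pd.DataFrame({
    "__timestamp": ["27/05/18", "03/06/18", "10/06/18"],
    "failed_count": ["0.019417 ", "0.427136 ", "0.839416 "],
    "passed_count": [0.980583, 0.839196, 0.854015],
})

# Inspect the dtypes first: failed_count is not float64 here.
print(df.dtypes)

# Strip the stray whitespace and convert the column to a numeric dtype.
df["failed_count"] = pd.to_numeric(df["failed_count"].str.strip())

# With both value columns numeric, both appear in the pivoted result.
pivoted = df.pivot_table(index="__timestamp",
                         values=["passed_count", "failed_count"])
print(pivoted)
```

Running df.dtypes right after read_csv on the real file is the quickest way to confirm whether this is what happened. If failed_count is already float64 there, the conversion step is unnecessary.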