python - Is there a way to loop through a Pandas DataFrame?
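Question
I'm working with a CSV file and want to plot some charts from it. However, I can't find a way to get the information I need directly from the DataFrame. Code:
import numpy as np
import pandas as pd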
path_main = '850566403_T_ONTIME.csv'
df1 = pd.read_csv(path_main, header=0, sep=",")
# Drop columns that are entirely NaN, then keep only rows that have an ARR_DELAY_NEW value
df1.dropna(axis=1, how='all', inplace=True)
df = df1.dropna(subset=['ARR_DELAY_NEW'])
Output of the resulting DataFrame:
YEAR MONTH AIRLINE_ID DEST_AIRPORT_ID ARR_DELAY_NEW
0 2015 2 19805 12892 0.0
2 2015 2 19805 12892 0.0
3 2015 2 19805 12892 0.0
4 2015 2 19805 12892 0.0
5 2015 2 19805 12892 0.0
... ... ... ... ...
429186 2015 2 19393 14107 0.0
429187 2015 2 19393 14107 35.0
429188 2015 2 19393 14679 99.0
429189 2015 2 19393 14679 23.0
429190 2015 2 19393 14679 20.0
[407663 rows x 5 columns]
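I'd like to know whether there is a way to build a dictionary from this. The keys would be my airline IDs (each of which repeats many times), and the value for each key would be the mean of "ARR_DELAY_NEW" for that distinct AIRLINE_ID. It would look like this:
d = {19805: average1, 19393: average2}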
Answer
Try this:
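d = {}
# Assumes rows for the same AIRLINE_ID are contiguous, as in the output above.
# After dropna the index labels have gaps, so look up the first row via df.index.
airline_id = df.at[df.index[0], 'AIRLINE_ID']
total = 0
count = 0
for index, row in df.iterrows():
    if df.at[index, 'AIRLINE_ID'] == airline_id:
        total += df.at[index, 'ARR_DELAY_NEW']
        count += 1
    else:
        # Store the finished airline's mean, then start the next airline
        # with the current row so that its delay is not skipped.
        d[airline_id] = total / count
        airline_id = df.at[index, 'AIRLINE_ID']
        total = df.at[index, 'ARR_DELAY_NEW']
        count = 1
# Store the mean for the last airline in the DataFrame.
d[airline_id] = total / count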
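The same dictionary can probably be built without an explicit loop by grouping on AIRLINE_ID. The following is a minimal sketch rather than part of the original answer; the small DataFrame is made-up stand-in data so that the example runs on its own, and in practice `df` would be the DataFrame loaded from the CSV file.

```python
import pandas as pd

# Made-up stand-in for the flight data (illustrative values only).
# Note that airline 19805 appears in two separate runs of rows.
df = pd.DataFrame({
    "AIRLINE_ID": [19805, 19805, 19393, 19393, 19393, 19805],
    "ARR_DELAY_NEW": [0.0, 10.0, 35.0, 99.0, 22.0, 20.0],
})

# Group rows by airline, average the delay within each group, and convert
# the resulting Series (indexed by AIRLINE_ID) into a plain dict.
d = df.groupby("AIRLINE_ID")["ARR_DELAY_NEW"].mean().to_dict()
print(d)
```

Unlike the loop, this does not depend on rows for one airline being next to each other, and it avoids the per-row overhead of iterrows on a DataFrame with several hundred thousand rows.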