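python - Error inside a loop while building a DataFrame
Problem description
I am currently working with world religion data and want to build a DataFrame with the columns ['country name', 'most adhered religion in the country', 'number of adherents'], but I ran into an error message. My code is below.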
'''
import pandas as pd
import geopandas
import matplotlib.pyplot as plt
import mapclassify
import pyproj
from pyproj import Proj
from matplotlib.patches import Ellipse, Polygon
import datetime
import numpy as np
countries = geopandas.read_file('../data/world/ne_admin_0_countries.geojson')
hse_size = pd.read_csv('../data/world/houseshold_size_2018.csv', skiprows=4, header=0)
rlgn_adhere = pd.read_csv('../data/world/WRP_national.csv', header=0)
religion_cat = []
rlgn_adhere_top = list(rlgn_adhere.columns.values)
for i in range(3,38):
    religion_cat.append(rlgn_adhere_top[i])
country_rlgn_adhere = rlgn_adhere.groupby(['name'], as_index=False)
lastest_rlgn_adhere = country_rlgn_adhere['year'].max()
country_latest_adhere = lastest_rlgn_adhere.merge(rlgn_adhere, on=['year', 'name'], how='left')
col_latest_rlgn_pop = ['year', 'name'] + religion_cat
latest_rlgn_pop = country_latest_adhere[col_latest_rlgn_pop]
pop_rlgn = ''
pop_rlgn_cat_num = pd.DataFrame(columns=['name', 'Country Most Adhered Religion', 'Number of Adherence'])
for x in latest_rlgn_pop['name']:
    maximum = 0
    a = pd.DataFrame()
    a = latest_rlgn_pop[latest_rlgn_pop['name'] == x]
    for y in religion_cat:
        b = pd.Series([])
        b = a[str(y)]
        print(b[0])
        if np.invert(np.isnan(b[0])):
            b = int(b[0])
            if (b > maximum):
                maximum = b
                pop_rlgn = y
    a.insert(0,"Number of Adherence", maximum)
    a.insert(0,"Country Most Adhered Religion", pop_rlgn)
    pop_rlgn_cat_num = pop_rlgn_cat_num.append(a[['name', 'Number of Adherence', 'Country Most Adhered Religion']],sort=True)
latest_rlgn_pop = pd.merge(latest_rlgn_pop, pop_rlgn_cat_num, on=['name'])
country_hse_size = hse_size.groupby(['Country or area'], as_index=False)
latest_size = country_hse_size['Reference date (dd/mm/yyyy)'].max()
avg_hse_size = hse_size[['Country or area', 'Reference date (dd/mm/yyyy)', 'Average household size (number of members)']]
country_latest_size = latest_size.merge(avg_hse_size, on=['Country or area','Reference date (dd/mm/yyyy)'], how='left')
country_latest_size = country_latest_size.dropna()
country_latest_size_unique = country_latest_size.groupby(['Country or area'], as_index=False)
country_latest_size_unique = country_latest_size_unique['Average household size (number of members)'].mean()
countries = countries[['ADMIN', 'geometry']]
countries.columns = ['Country or area', 'geometry']
countries_household_size = countries.merge(country_latest_size_unique, on='Country or area', how='left')
'''
In my nested for loop, the second time line 36, 'print(b[0])', runs, the console shows this error message:
Traceback (most recent call last):
  File "C:\Users\USER\Anaconda3\envs\MaCT\lib\site-packages\pandas\core\series.py", line 1068, in __getitem__
    result = self.index.get_value(self, key)
  File "C:\Users\USER\Anaconda3\envs\MaCT\lib\site-packages\pandas\core\indexes\base.py", line 4730, in get_value
    return self._engine.get_value(s, k, tz=getattr(series.dtype, "tz", None))
  File "pandas\_libs\index.pyx", line 80, in pandas._libs.index.IndexEngine.get_value
  File "pandas\_libs\index.pyx", line 88, in pandas._libs.index.IndexEngine.get_value
  File "pandas\_libs\index.pyx", line 131, in pandas._libs.index.IndexEngine.get_loc
  File "pandas\_libs\hashtable_class_helper.pxi", line 992, in pandas._libs.hashtable.Int64HashTable.get_item
  File "pandas\_libs\hashtable_class_helper.pxi", line 998, in pandas._libs.hashtable.Int64HashTable.get_item
KeyError: 0
I have not found any clues about this error message. Can anyone help? Thanks.
Here is a link to the dataset I have been using: Country_Household_Religions_Dataset
Solution
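Judging from the traceback, a likely cause is that `b[0]` is a label-based lookup. After filtering with `latest_rlgn_pop['name'] == x`, the one-row result keeps its original index label, so only the first country has label 0, and every later country raises `KeyError: 0`. The sketch below reproduces this on a small made-up table and shows two possible ways around it: positional access with `.iloc[0]`, and a loop-free variant using `idxmax` and `max`. The data and the names `rel_a`, `rel_b`, `rel_c`, `key_error_seen`, `first_value`, and `summary` are placeholders for illustration, not part of the real dataset.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for latest_rlgn_pop: one row per country,
# one column per religion category (names and values are made up).
latest_rlgn_pop = pd.DataFrame(
    {
        "year": [2010, 2010, 2010],
        "name": ["A", "B", "C"],
        "rel_a": [100.0, 5.0, np.nan],
        "rel_b": [20.0, 50.0, 7.0],
        "rel_c": [np.nan, 1.0, 3.0],
    }
)
religion_cat = ["rel_a", "rel_b", "rel_c"]

# Filtering keeps the original index label: country "B" sits at label 1.
a = latest_rlgn_pop[latest_rlgn_pop["name"] == "B"]
b = a["rel_a"]

key_error_seen = False
try:
    b[0]  # label lookup on an integer index; there is no label 0 here
except KeyError:
    key_error_seen = True

# Positional access does not depend on the index labels.
first_value = b.iloc[0]

# Loop-free variant: per row, take the column holding the largest value
# and the value itself. NaN entries are skipped by default; rows where
# every religion column is NaN would need separate handling.
values = latest_rlgn_pop[religion_cat]
summary = pd.DataFrame(
    {
        "name": latest_rlgn_pop["name"],
        "Country Most Adhered Religion": values.idxmax(axis=1),
        "Number of Adherence": values.max(axis=1),
    }
)
print(summary)
```

If the original loop is kept, replacing `b[0]` with `b.iloc[0]` should be enough to get past the `KeyError`. The loop-free variant also avoids `DataFrame.append`, which was removed in pandas 2.0.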