Pandas loc error caused by labels missing from the CSV file

Problem description

I am working through a choropleth map tutorial.

When I try to run it, I get an error on the following line: df = df.ix[iso3_codes].dropna()

AttributeError: 'DataFrame' object has no attribute 'ix'

It seems ix was deprecated and has since been removed from pandas.

I then changed the line to df = df.loc[iso3_codes].dropna(), but got this error:

KeyError: 'Passing list-likes to .loc or [] with any missing labels is no longer supported

How can I fix this?

Additional information

I also tried df = df.loc[:, iso3_codes].dropna(), but it gave me this error:

KeyError: "None of [Index(['AND', 'ARE', 'AFG', 'ATG',.....'ANT'],\n      dtype='object', length=252)] are in the [columns]"

Maybe loc is not the right tool for filtering out non-countries and missing values. What should I do now?

Full code

import matplotlib as mpl
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import slug

from geonamescache import GeonamesCache
from matplotlib.patches import Polygon
from matplotlib.collections import PatchCollection
from mpl_toolkits.basemap import Basemap

filename = 'csv/API_AG.LND.FRST.ZS_DS2_en_csv_v2_988532/API_AG.LND.FRST.ZS_DS2_en_csv_v2_988532.csv'
shapefile = 'shapes/countries/countries/ne_10m_admin_0_countries_lakes'
num_colors = 9
year = '2012'
cols = ['Country Name', 'Country Code', year]
title = f'Forest area as percentage of land area in {format(year)}'
imgfile = f'img{slug.slug(title)}.png'

description = '''
Forest area is land under natural or planted strands of trees of at least 5 meters in situ, whether productive or not,
and excludes tree strands in agricultural production systems(for example, in fruit plantations and agroforestry systems)
and trees in urban parks and gardens. Countries without data are shown in grey.
Data: World Bank - worldbank.org | Author: Ramiro Gómez - ramiro.org
'''.strip()

gc = GeonamesCache()
iso3_codes = list(gc.get_dataset_by_key(gc.get_countries(), 'iso3').keys())

df = pd.read_csv(filename, skiprows=4, usecols=cols)
df.set_index('Country Code', inplace=True)
df = df.loc[:, iso3_codes].dropna()     # filter out non-countries and missing values

values = df[year]
cm = plt.get_cmap('Greens')
scheme = [cm(i / num_colors) for i in range(num_colors)]
bins = np.linspace(values.min(), values.max(), num_colors)
df['bin'] = np.digitize(values, bins) - 1
df.sort_values('bin', ascending=False).head(10)


print(f'Available Styles: {plt.style.available}')
mpl.style.use('map')
fig = plt.figure(figsize=(22, 12))

ax = fig.add_subplot(111, facecolor='w', frame_on=False)
fig.suptitle(f'Forest area as percentage of land area in {year}', fontsize=30, y=0.95)

m = Basemap(lon_0=0, projection='robin')
m.drawmapboundary(color='w')

m.readshapefile(shapefile, 'units', color='#444444', linewidth=0.2)
for info, shape in zip(m.units_info, m.units):
    iso3 = info['ADM0_A3']
    if iso3 not in df.index:
        color = '#dddddd'
    else:
        color = scheme[df.loc[iso3]['bin']]

    patches = [Polygon(np.array(shape), True)]
    pc = PatchCollection(patches)
    pc._set_facecolor(color)
    ax.add_collection(pc)

# Cover up Antarctica so the legend can be placed over it
ax.axhspan(0, 1000 * 1800, facecolor='w', edgecolor='w', zorder=2)

# Draw color legend
ax_legend = fig.add_axes([0.35, 0.14, 0.3, 0.3], zorder=3)
cmap = mpl.colors.ListedColormap(scheme)
cb = mpl.colorbar.ColorbarBase(ax_legend, cmap=cmap, ticks=bins, boundaries=bins, orientation='horizontal')
cb.ax.set_xticklabels([str(round(i, 1)) for i in bins])

# Set the map footer
plt.annotate(description, xy=(-0.8, -3.2), size=14, xycoords=0.2)
plt.show()

Full error

/Users/me/.pyenv/versions/3.8.1/bin/python /Users/me/PycharmProjects/ChoroplethMap/chloropleth_map.py
Traceback (most recent call last):
  File "/Users/me/PycharmProjects/ChoroplethMap/chloropleth_map.py", line 33, in <module>
    df = df.loc[:, iso3_codes].dropna()     # filter out non-countries and missing values
  File "/Users/me/.pyenv/versions/3.8.1/lib/python3.8/site-packages/pandas/core/indexing.py", line 1762, in __getitem__
    return self._getitem_tuple(key)
  File "/Users/me/.pyenv/versions/3.8.1/lib/python3.8/site-packages/pandas/core/indexing.py", line 1289, in _getitem_tuple
    retval = getattr(retval, self.name)._getitem_axis(key, axis=i)
  File "/Users/me/.pyenv/versions/3.8.1/lib/python3.8/site-packages/pandas/core/indexing.py", line 1954, in _getitem_axis
    return self._getitem_iterable(key, axis=axis)
  File "/Users/me/.pyenv/versions/3.8.1/lib/python3.8/site-packages/pandas/core/indexing.py", line 1595, in _getitem_iterable
    keyarr, indexer = self._get_listlike_indexer(key, axis, raise_missing=False)
  File "/Users/me/.pyenv/versions/3.8.1/lib/python3.8/site-packages/pandas/core/indexing.py", line 1552, in _get_listlike_indexer
    self._validate_read_indexer(
  File "/Users/me/.pyenv/versions/3.8.1/lib/python3.8/site-packages/pandas/core/indexing.py", line 1640, in _validate_read_indexer
    raise KeyError(f"None of [{key}] are in the [{axis_name}]")
KeyError: "None of [Index(['AND', 'ARE', 'AFG', 'ATG', 'AIA', 'ALB', 'ARM', 'AGO', 'ATA', 'ARG',\n       ...\n       'VUT', 'WLF', 'WSM', 'YEM', 'MYT', 'ZAF', 'ZMB', 'ZWE', 'SCG', 'ANT'],\n      dtype='object', length=252)] are in the [columns]"

Tags: python, pandas

Solution


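Replace

df = df.loc[:, iso3_codes].dropna()

with

df = df.reindex(index=iso3_codes).dropna()

After set_index('Country Code'), the ISO3 codes are row labels, so df.loc[:, iso3_codes] looks for them among the columns and finds none. reindex selects rows by label, accepts labels that are missing from the index, and fills their rows with NaN, which dropna() then removes.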
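To see the difference between loc and reindex on a small scale, here is a minimal sketch. The three-row table, its placeholder values, and the short iso3_codes list are hypothetical stand-ins for the World Bank CSV and the GeonamesCache codes, not real data. It assumes pandas 1.0 or later, where loc raises KeyError for a list that contains a missing label.

```python
import pandas as pd

# Hypothetical miniature of the CSV: "WLD" is an aggregate row, not a country.
# The numbers are placeholders, not real forest-area figures.
df = pd.DataFrame(
    {
        "Country Code": ["AFG", "ALB", "WLD"],
        "Country Name": ["Afghanistan", "Albania", "World"],
        "2012": [10.0, 20.0, 30.0],
    }
).set_index("Country Code")

# Hypothetical subset of ISO3 codes; "ATA" does not appear in the table.
iso3_codes = ["AFG", "ALB", "ATA"]

# .loc with a list that contains a missing label raises KeyError.
try:
    df.loc[iso3_codes]
    loc_failed = False
except KeyError:
    loc_failed = True

# reindex keeps only the requested labels ("WLD" is dropped) and
# creates an all-NaN row for the label that was missing ("ATA").
reindexed = df.reindex(index=iso3_codes)

# dropna removes the NaN row, leaving countries that have data.
filtered = reindexed.dropna()
print(filtered)
```

Note that the result follows the order of iso3_codes rather than the order of the rows in the CSV.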

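If you would rather keep the rows in their original CSV order, a boolean mask built with Index.isin is another option. This is only a sketch of an alternative, not part of the answer above, and the table and code list are again hypothetical placeholders.

```python
import pandas as pd

# Hypothetical table indexed by country code; "ALB" has no value for 2012
# and "WLD" is an aggregate rather than a country.
df = pd.DataFrame(
    {"2012": [10.0, None, 30.0]},
    index=pd.Index(["AFG", "ALB", "WLD"], name="Country Code"),
)

# Hypothetical subset of ISO3 codes; "ATA" is not in the table.
iso3_codes = ["AFG", "ALB", "ATA"]

# isin returns a boolean array marking rows whose label is in iso3_codes.
# Labels absent from the table are simply ignored, so no KeyError is raised.
mask = df.index.isin(iso3_codes)

# Keep the matching rows, then drop those with missing values.
filtered = df[mask].dropna()
print(filtered)
```

Unlike reindex, this never creates new rows, so dropna only removes countries whose data is actually missing in the CSV.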