How to change the marker style for different data types when using pandas with a csv file

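Question

I have a csv file containing data on 100 GB places, with columns for name, population, type (town or city), latitude and longitude. I have plotted them on a map of longitude against latitude, with marker size proportional to population and colour depending on the nation. I am struggling to find a way to change the marker style. Ideally, I would like towns to use ^ and cities to use v. Here is my code so far.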

# imports
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import matplotlib.patches as mpatches

# import data file
# select data columns needed
data = pd.read_csv('GBplaces.csv', sep = ',', usecols = (0,1,2,3,4))
# name data columns
data.columns = ('Place', 'Type', 'Population', 'Latitude', 'Longitude')



# make markers for towns and cities from different nations different colours
scottish_places = ['Aberdeen', 'Dundee', 'Glasgow', 'Edinburgh']
welsh_places = ['Swansea', 'Cardiff', 'Newport']

# England in red (the default for every place not listed above)
data['Colour'] = 'r'

# Scotland in blue
data.loc[data['Place'].isin(scottish_places), 'Colour'] = 'b'

# Wales in black
data.loc[data['Place'].isin(welsh_places), 'Colour'] = 'black'

# legend created for colours for each nation
red_marker = mpatches.Patch(color='r',label='England')
blue_marker = mpatches.Patch(color='b', label='Scotland')
black_marker = mpatches.Patch(color='black', label='Wales')
legend = plt.legend(handles=[red_marker, blue_marker, black_marker])

# colour added to background
ax = plt.gca()
ax.patch.set_facecolor('#CCFFFF')

# make point size proportional to population
area = data['Population']/100000

plt.scatter(data['Longitude'], data['Latitude'], c=data['Colour'], s=area)

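So far I have tried setting the marker style the same way I set the colours, but that results in an empty plot. Any help would be greatly appreciated.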

Tags: python, pandas, matplotlib, scatter-plot

Solution


First, some dummy data:

df = pd.DataFrame(data={
    'Place': ['Scotland', 'Scotland', 'England', 'England', 'Wales', 'Wales'], 
    'x': [100, 90, 80, 70, 60, 50], 
    'y': [10, 20, 30, 40, 50, 60]
})

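Group by Place, make a list of markers, and then cycle through it, drawing each group with its own scatter call. In your case, the grouping key would be the type (city or town) rather than Place.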

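from itertools import cycle

import matplotlib.pyplot as plt

ax = plt.gca()
ax.patch.set_facecolor('#FFFFFF')

# one group of rows per distinct value of Place
places = df.groupby('Place')

# marker styles to cycle through, one per group
markers = ['o', '1', ',']

for (name, place), marker in zip(places, cycle(markers)):
    # each group gets its own scatter call, so it can have its own marker
    ax.scatter(place.x, place.y, marker=marker, label=name)

# legend entries come from the label passed to each scatter call
ax.legend()

plt.show()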
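
Applied to the data in the question, the same idea might look like the sketch below. The `marker` argument of `scatter` takes a single style per call, not one style per point, so towns and cities are drawn in separate calls. The rows here are made up to stand in for GBplaces.csv, and the `Type` values 'Town' and 'City' and the `marker_for_type` mapping are assumptions; adjust them to whatever the file actually contains.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove this line to get a plot window
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical rows standing in for GBplaces.csv (values are invented for illustration)
data = pd.DataFrame({
    'Place': ['Aberdeen', 'Cardiff', 'Exampletown', 'Sampleton'],
    'Type': ['City', 'City', 'Town', 'Town'],
    'Population': [200000, 350000, 80000, 60000],
    'Latitude': [57.1, 51.5, 52.0, 53.0],
    'Longitude': [-2.1, -3.2, -1.0, -1.5],
    'Colour': ['b', 'black', 'r', 'r'],
})

# Assumed mapping from the values in the Type column to marker styles
marker_for_type = {'Town': '^', 'City': 'v'}

fig, ax = plt.subplots()
ax.set_facecolor('#CCFFFF')

collections = {}
for place_type, group in data.groupby('Type'):
    # one scatter call per type, because marker accepts only a single style
    collections[place_type] = ax.scatter(
        group['Longitude'], group['Latitude'],
        c=group['Colour'].tolist(),       # per-point colours still work within a call
        s=group['Population'] / 100000,   # same size scaling as in the question
        marker=marker_for_type[place_type],
        label=place_type,
    )

ax.legend()
```

To view the result, call `plt.show()` with an interactive backend or save it with `fig.savefig(...)`. The legend here distinguishes towns from cities; the nation colour legend from the question would need to be added separately.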
