Passing a Pandas DataFrame to matplotlib markers and text

Problem description

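I'm struggling to create a scatter plot from a Pandas DataFrame in which each point has its own label, marker, and color. The code below has two problems, both described in the code comments. For some reason, matplotlib does not accept the way I pass the markers and labels, although the color column works fine.

How should the values be passed?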

import pandas as pd
import math
import matplotlib.pyplot as plt

# Read CSV
# Distance,Bearing,Depth,Name,Description,Done,Color,Marker
# 630,90,250,Foo,Bar,FALSE,#777,o
df = pd.read_csv('data.csv')

OFFSET = 100

# Create a few columns from calculations
# This code works as expected
df['h'] = round((df['Distance']**2 - (df['Depth']-OFFSET)**2).apply(math.sqrt)).astype(int)
df['dir'] = (df['Bearing'] - 180) % 360
df['x'] = (df['dir'].apply(math.radians).apply(math.sin) * df['h']).astype(float)
df['y'] = (df['dir'].apply(math.radians).apply(math.cos) * df['h']).astype(float)

# Check that everything looks OK (it does)
print(df.head())

# Problem #1: This works...
plt.scatter(x=df['x'], y=df['y'], c=df['Color'], marker='o')

# ...but this doesn't
# (TypeError: unhashable type: 'Series')
plt.scatter(x=df['x'], y=df['y'], c=df['Color'], marker=df['Marker'])

# Problem #2: Error when adding labels
# (ValueError: The truth value of a Series is ambiguous,
# Use a.empty, a.bool(), a.item(), a.any() or a.all().)
plt.text(x=df['x'], y=df['y'], s=df['Name'])

plt.show()

The following does work as expected, but it seems like quite a detour:

x = df['x'].tolist()
y = df['y'].tolist()
color = df['Color'].tolist()
marker = df['Marker'].tolist()
label = df['Name'].tolist()

for x, y, c, m, l in zip(x, y, color, marker, label):
    plt.scatter(x, y, c=c, marker=m)
    plt.text(x, y, l)

Tags: python, pandas, matplotlib

Solution
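One possible approach, offered as a sketch rather than a definitive answer: `scatter` takes a single marker style per call, which is consistent with the `unhashable type: 'Series'` error, and `text` places one string at one point, so neither accepts a whole column. Grouping the rows by marker reduces the number of `scatter` calls to one per distinct marker, and the labels can be added row by row. The inline DataFrame, its values, and the output filename `scatter.png` are hypothetical stand-ins for the `data.csv` input and the computed `x`/`y` columns.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical sample rows standing in for data.csv (x and y already computed).
df = pd.DataFrame({
    "x": [0.0, 100.0, -50.0, 75.0],
    "y": [10.0, -20.0, 30.0, 60.0],
    "Name": ["Foo", "Bar", "Baz", "Qux"],
    "Color": ["#777777", "#ff0000", "#00aa00", "#0000ff"],
    "Marker": ["o", "s", "o", "^"],
})

fig, ax = plt.subplots()

# One scatter() call per distinct marker; colors can still vary per point.
for marker, group in df.groupby("Marker"):
    ax.scatter(group["x"], group["y"], c=group["Color"].tolist(), marker=marker)

# text() handles a single label at a time, so iterate over the rows.
for row in df.itertuples(index=False):
    ax.text(row.x, row.y, row.Name)

fig.savefig("scatter.png")  # hypothetical output file; use plt.show() interactively
```

Compared with the per-point loop in the question, this issues far fewer `scatter` calls when many points share a marker, while the label loop stays one call per row.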

