Plotly: How to manually change the legend items when plotting columns?

Problem description

I have the following pandas dataframe with the population of two countries over several years:

   year          pop1          pop2
0     1  1.000000e+08  1.000000e+08
1     2  9.620000e+07  9.970000e+07
2     3  9.254440e+07  9.940090e+07
3     4  8.902771e+07  9.910270e+07
4     5  8.564466e+07  9.880539e+07

I want to create a line plot where the y values come from the pop columns:

fig = px.line(data, x="year", y="pop1", title='Population')

fig.add_scatter(x=data['year'], y=data['pop2'], mode='lines')

fig.show()

The result looks like this: [image]

My problem is that the legend shows only one line, and it seems I can't control it (e.g. to change it from "trace" to pop1 and pop2). I have seen that there is an option to use "color", but that seems to be impossible when plotting columns.

My end goal here is to be able to control the legend - to have the column names (pop1 and pop2) as the legend items.

Tags: python, pandas, charts, plotly, line

Solution


Short answer:

To keep the solution close to your original setup, you can do this:

fig = px.line(data, x="year", y="pop1", title='Population')
fig.data[0].name="pop1"
fig.update_traces(showlegend=True)
fig.add_scatter(x=data['year'], y=data['pop2'], mode='lines', name = "pop2")

Some details:

The suggestion in the comment from @TeejayBruno will solve your problem. But the approach described there differs fundamentally from the steps you've described. And I suspect that there is a reason why you're first building a figure using

fig = px.line(data, x="year", y="pop1", title='Population')

And then adding new traces using:

fig.add_scatter(x=data['year'], y=data['pop2'], mode='lines')

So I thought I'd shed some light on why the legend is "missing" after the first step, and then how to make sure that "pop1" is included in the legend when you're adding more traces in step 2.


The complete answer:


1. Why is the legend missing for px.line(data, x="year", y="pop1", title='Population')

There's a perfectly good explanation for that. Take a look at the following plot. When px.line only picks up one trace, it decides that a legend is superfluous and that the information could be more naturally displayed as the label of the y-axis. And I pretty much agree with the decision the plotly devs have made there:

Figure 1


But this does not make as much sense when users decide to build on that figure by adding traces through fig.add_scatter(). And this is the exact problem you've stumbled upon.

2. How can you fix the legend manually and keep adding traces?

When you use fig = px.line(data, x="year", y=["pop1", "pop2"], title='Population') with multiple y categories, px.line understands that displaying all that information as a label for the y-axis doesn't make much sense anymore, and produces a legend (the green circle in the figure below). At the same time, the y-axis label is renamed to "value" (the red circle):


And what additionally happens under the hood is that the traces in the data property of the fig object are named "pop1" and "pop2":

<bound method BaseFigure.show of Figure({
    'data': [{'hovertemplate': 'variable=pop1<br>year=%{x}<br>value=%{y}<extra></extra>',
              'legendgroup': 'pop1',
              'line': {'color': '#636efa', 'dash': 'solid'},
              'mode': 'lines',
              'name': 'pop1',
              'orientation': 'v',
              'showlegend': True,
              'type': 'scatter',
              'x': array([1, 2, 3, 4, 5], dtype=int64),
              'xaxis': 'x',
              'y': array([1.000000e+08, 9.620000e+07, 9.254440e+07, 8.902771e+07, 8.564466e+07]),
              'yaxis': 'y'},
             {'hovertemplate': 'variable=pop2<br>year=%{x}<br>value=%{y}<extra></extra>',
              'legendgroup': 'pop2',
              'line': {'color': '#EF553B', 'dash': 'solid'},
              'mode': 'lines',
              'name': 'pop2',
              'orientation': 'v',

And therein lies the solution to how you can adjust the legend properties to your needs:

1. Make sure that 'name': 'pop1' for the first trace using fig.data[0].name="pop1".

2. Set the figure to display trace names in the legend with fig.update_traces(showlegend=True) (figure 2.1).

3. Include names for all subsequent traces using fig.add_scatter(x=data['year'], y=data['pop2'], mode='lines', name = "pop2") (figure 2.2).

4. Rename the y-axis label to whatever you'd like using, for example, fig.update_yaxes(title=dict(text='People')).

Figure 2.1


Figure 2.2


Complete code:

import plotly.express as px
import pandas as pd

data = pd.DataFrame({'year': {0: 1, 1: 2, 2: 3, 3: 4, 4: 5},
                     'pop1': {0: 100000000.0,
                      1: 96200000.0,
                      2: 92544400.0,
                      3: 89027710.0,
                      4: 85644660.0},
                     'pop2': {0: 100000000.0,
                      1: 99700000.0,
                      2: 99400900.0,
                      3: 99102700.0,
                      4: 98805390.0}})

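# Single-column px.line: the trace gets no legend entry by default
fig = px.line(data, x="year", y="pop1", title='Population')
# Wide-form alternative that names the traces automatically:
# fig = px.line(data, x="year", y=["pop1", "pop2"], title='Population')
fig.data[0].name = "pop1"           # name the first trace
fig.update_traces(showlegend=True)  # show the named trace in the legend

# Give each added trace an explicit name so it gets a proper legend entry
fig.add_scatter(x=data['year'], y=data['pop2'], mode='lines', name="pop2")
fig.update_yaxes(title=dict(text='People'))  # replace the default "pop1" axis label
fig.show()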
