PlotnineError: 'Aesthetics must either be length one, or the same length as the data'

Problem description

I am trying to build a waterfall chart using plotnine. I have 9 groupings (percentiles), so I would like a 3x3 facet_wrap plot.

Below is some sample data and what I want the plot to look like based on 1 of the 9 groupings. I get errors when trying to add more categories and facet_wrap.

Code for 1 grouping and illustration of what I am trying to do:

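import numpy as np
import pandas as pd
from plotnine import aes, facet_wrap, geom_segment, ggplot, scale_y_continuous, theme_light, ylab

df = pd.DataFrame({})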
df['label'] = ('A','B','C','D','E')
df['percentile'] = (10,)*5    # one-element tuple repeated five times
df['value'] = (100,80,90,110,110)
df['yStart'] = (0,100,80,90,0)
df['barLabel'] = ('100','-20','+10','+20','110')
df['labelPosition'] = ('105','75','95','115','115')
df['colour'] = ('grey','red','green','green','grey')

p = (ggplot(df, aes(x=np.arange(0,5,1), xend=np.arange(0,5,1), y='yStart',yend='value',fill='colour'))
    + theme_light(6)
    + geom_segment(size=10)
    + ylab('value')
    + scale_y_continuous(breaks=np.arange(0,141,20), limits=[0,140], expand=(0,0))
)


However, my dataframe looks more like this (ie stacked groups):

df = pd.DataFrame({})
df['label'] = ('A','B','C','D','E','A','B','C','D','E')
df['percentile'] = (10,)*5 + (20,)*5    # five rows per percentile group
df['value'] = (100,80,90,110,110)*2
df['yStart'] = (0,100,80,90,0)*2
df['barLabel'] = ('100','-20','+10','+20','110')*2
df['labelPosition'] = ('105','75','95','115','115')*2
df['colour'] = ('grey','red','green','green','grey')*2

And when I try:

p = (ggplot(df, aes(x=np.arange(0,5,1), xend=np.arange(0,5,1), y='yStart',yend='value'))
    + theme_light(6)
    + geom_segment(size=10)
    + ylab('value')
    + facet_wrap('~percentile')
    + scale_y_continuous(breaks=np.arange(0,141,20), limits=[0,140], expand=(0,0))
)

I get the following error:

PlotnineError: 'Aesthetics must either be length one, or the same length as the data'

Tags: python, python-3.x, plotnine

Solution


The error occurs because the arrays you pass to the x and xend aesthetics contain only 5 values, while the DataFrame in your second attempt has 10 rows (5 per desired facet). An aesthetic supplied as raw data must have either one value (which is repeated for every row) or as many values as the input DataFrame has rows (10 here), like below.

x_dat = [i for i in np.arange(0,5,1)]*2       # [0, 1, 2, 3, 4, 0, 1, 2, 3, 4]
p = (ggplot(df, aes(x=x_dat, xend=x_dat, y='yStart',yend='value'))
    + theme_light(6)
    + geom_segment(size=10)
    + ylab('value')
    + facet_wrap('~percentile')
    + scale_y_continuous(breaks=np.arange(0,141,20), limits=[0,140], expand=(0,0))
)
p
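
For the full nine groupings mentioned in the question, the same repeated pattern could be generated rather than typed out. This is only a sketch: n_labels and n_groups are assumed names and values, and it relies on the rows being ordered in blocks of five labels per percentile.

```python
import numpy as np

n_labels = 5   # bars per facet (labels A to E)
n_groups = 9   # assumed: the nine percentile groupings from the question

# Repeat 0..4 once per group so there is exactly one x value per DataFrame row.
x_dat = np.tile(np.arange(n_labels), n_groups).tolist()

# The aesthetic must be as long as the data it is plotted against.
assert len(x_dat) == n_labels * n_groups
```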

It's safer to add this data as a column to your DataFrame, so the x locations stay aligned even if the rows aren't in the exact order you expect. You can then refer to the column by name, and that name is printed as the axis label.
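
A minimal sketch of the column approach, assuming the bar position should follow the label (A to E) within each facet. The names x, label_order, position and build_plot are illustrative and not part of the original code.

```python
import pandas as pd

# Sample data from the question, with the two percentile groups stacked.
df = pd.DataFrame({
    'label': list('ABCDE') * 2,
    'percentile': [10] * 5 + [20] * 5,
    'value': [100, 80, 90, 110, 110] * 2,
    'yStart': [0, 100, 80, 90, 0] * 2,
})

# Assumption: bars should appear in this label order within every facet.
label_order = ['A', 'B', 'C', 'D', 'E']
position = {label: i for i, label in enumerate(label_order)}

# One x value per row, derived from the label rather than from the row order.
df['x'] = df['label'].map(position)


def build_plot(data):
    """Build the faceted plot; plotnine is imported here so the data prep runs without it."""
    from plotnine import aes, facet_wrap, geom_segment, ggplot, theme_light, ylab

    return (ggplot(data, aes(x='x', xend='x', y='yStart', yend='value'))
        + theme_light(6)
        + geom_segment(size=10)
        + ylab('value')
        # column name given directly instead of the '~percentile' formula string
        + facet_wrap('percentile')
    )
```

Because x is looked up from the label, shuffling or re-sorting the rows does not move any bar, and calling build_plot(df) gives an x axis labelled with the column name.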

