python - PlotnineError: 'Aesthetics must either be length one, or the same length as the data'
Problem description
I am trying to build a waterfall chart using plotnine. I have 9 groupings (percentiles), so I would like a 3x3 facet_wrap plot.
Below is some sample data and what I want the plot to look like based on 1 of the 9 groupings. I get errors when trying to add more categories and facet_wrap.
Code for 1 grouping and illustration of what I am trying to do:
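import numpy as np
import pandas as pd
from plotnine import (ggplot, aes, geom_segment, theme_light, ylab,
                      scale_y_continuous, facet_wrap)
df = pd.DataFrame({})
df['label'] = ('A','B','C','D','E')
df['percentile'] = (10,)*5  # (10)*5 is just the integer 50; a one-element tuple repeats the value 10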
df['value'] = (100,80,90,110,110)
df['yStart'] = (0,100,80,90,0)
df['barLabel'] = ('100','-20','+10','+20','110')
df['labelPosition'] = ('105','75','95','115','115')
df['colour'] = ('grey','red','green','green','grey')
p = (ggplot(df, aes(x=np.arange(0,5,1), xend=np.arange(0,5,1), y='yStart',yend='value',fill='colour'))
+ theme_light(6)
+ geom_segment(size=10)
+ ylab('value')
+ scale_y_continuous(breaks=np.arange(0,141,20), limits=[0,140], expand=(0,0))
)
However, my dataframe looks more like this (i.e. stacked groups):
df = pd.DataFrame({})
df['label'] = ('A','B','C','D','E','A','B','C','D','E')
df['percentile'] = (10,)*5 + (20,)*5  # blocks of five rows per group; (10,20)*5 would alternate 10,20,10,20,...
df['value'] = (100,80,90,110,110)*2
df['yStart'] = (0,100,80,90,0)*2
df['barLabel'] = ('100','-20','+10','+20','110')*2
df['labelPosition'] = ('105','75','95','115','115')*2
df['colour'] = ('grey','red','green','green','grey')*2
And when I try:
p = (ggplot(df, aes(x=np.arange(0,5,1), xend=np.arange(0,5,1), y='yStart',yend='value'))
+ theme_light(6)
+ geom_segment(size=10)
+ ylab('value')
+ facet_wrap('~percentile')
+ scale_y_continuous(breaks=np.arange(0,141,20), limits=[0,140], expand=(0,0))
)
I get the following error:
PlotnineError: 'Aesthetics must either be length one, or the same length as the data'
Solution
The error occurs because the arrays you pass to the x and xend aesthetics have only 5 elements, while the DataFrame in your second attempt has 10 rows (5 per desired facet). An aesthetic given as an array must have either one element (it is then repeated for every row) or as many elements as the DataFrame has rows (10 here), as below.
x_dat = [i for i in np.arange(0,5,1)]*2 # [0, 1, 2, 3, 4, 0, 1, 2, 3, 4]
p = (ggplot(df, aes(x=x_dat, xend=x_dat, y='yStart',yend='value'))
+ theme_light(6)
+ geom_segment(size=10)
+ ylab('value')
+ facet_wrap('~percentile')
+ scale_y_continuous(breaks=np.arange(0,141,20), limits=[0,140], expand=(0,0))
)
p
It's safer to add this data as a column to your DataFrame, so the x locations stay aligned even if the rows aren't in the exact order you expect. You can then refer to the column by name in aes, and that name is printed as the axis label.
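A minimal sketch of that column-based approach, assuming the stacked sample data from the question. The column name `x` and the `position` lookup are hypothetical names chosen for illustration, not something from the original post.

```python
import pandas as pd

# Stacked sample data: five bars (A-E) for each of two percentile groups.
labels = ['A', 'B', 'C', 'D', 'E']
df = pd.DataFrame({
    'label': labels * 2,
    'percentile': [10] * 5 + [20] * 5,
    'value': [100, 80, 90, 110, 110] * 2,
    'yStart': [0, 100, 80, 90, 0] * 2,
})

# Hypothetical column 'x': the bar position is looked up from the label,
# so it does not depend on the order of the rows.
position = {label: i for i, label in enumerate(labels)}
df['x'] = df['label'].map(position)

# Shuffling the rows keeps each label paired with its own position.
shuffled = df.sample(frac=1, random_state=0)
```

With the column in place, the mapping can refer to it by name, for example `aes(x='x', xend='x', y='yStart', yend='value')`, and every aesthetic then has the same length as the data.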