How to include math symbols in a dplyr tibble character variable?

Question

I have data on individuals earning different wage rates and I would like to make a ggplot that includes a separate series for each wage bin.

The code below works, but I was hoping there is a way to include math symbols in the variable so that <= is rendered as the proper symbol in the legend (< with a bar underneath). I want to avoid defining the legend labels in the plotting code, because I have to repeat this type of plot many times and, if I add new wage bins, the series might inadvertently become mislabeled. In the future I might also want to include other math symbols in the legend whose raw-text forms are harder to recognize than <=.

library(tidyverse)

wage_data <- tibble(
  x=runif(100, 8, 11),
  y=rnorm(100), 
  z=y+rnorm(100, 0, 0.5),
  worker_group=case_when(
    x<9 ~ "Wage<9",
    x>=9 & x<10 ~ "9<=Wage<10",
    x>=10 ~ "Wage>=10"
  )
)
wage_data %>% 
  ggplot(aes(y, z, color=worker_group)) + geom_point()

Perhaps a yet better way to do this would be to store the worker_group variable as a factor where the labels contain the math symbols and I can control the order of the levels. Any idea how to do this? Thanks in advance.

Tags: r, ggplot2, dplyr, tibble

Answer


I don't use ggplot, but this works for base graphics and should work for ggplot as well.

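# Sort the groups so the labels line up with the legend breaks;
# unique() alone returns them in order of first appearance.
lvls <- sort(unique(wage_data$worker_group))
# Wrap the text before the last comparison operator in paste() so that
# chained comparisons such as 9<=Wage<10 parse as plotmath.
lbl <- gsub('^(.*)(?=[<>])', 'paste(\\1)', lvls, perl = TRUE)
# e.g. "paste(9<=Wage)<10" "paste(Wage)<9" "paste(Wage)>=10"

ggplot(wage_data, aes(y, z, color = worker_group)) +
  geom_point() +
  scale_color_manual(values = scales::hue_pal()(3), breaks = lvls,
                     labels = parse(text = lbl))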
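
On the factor idea from the question: the following is only a sketch, not a tested drop-in. It assumes the bins can be built with cut() instead of case_when(), and it uses a made-up helper name, to_plotmath, that applies the same paste() regex as above. Passing the helper as a function to labels means the legend text is always derived from the factor levels, so adding a bin should not shift the labels, and the legend order follows the level order. A plain data.frame is used here only to keep the example small.

```r
library(ggplot2)

# Hypothetical helper (not part of ggplot2): wrap the text before the last
# comparison operator in paste() and parse the result as plotmath.
to_plotmath <- function(x) {
  parse(text = gsub("^(.*)(?=[<>])", "paste(\\1)", x, perl = TRUE))
}

set.seed(1)
wage <- runif(100, 8, 11)
y <- rnorm(100)

wage_data <- data.frame(
  y = y,
  z = y + rnorm(100, 0, 0.5),
  # right = FALSE gives the bins [-Inf, 9), [9, 10), [10, Inf);
  # the order of `labels` sets the factor level order, and so the legend order.
  worker_group = cut(wage, breaks = c(-Inf, 9, 10, Inf), right = FALSE,
                     labels = c("Wage<9", "9<=Wage<10", "Wage>=10"))
)

# `labels` receives the legend breaks (the factor levels) and returns
# one plotmath expression per break.
p <- ggplot(wage_data, aes(y, z, color = worker_group)) +
  geom_point() +
  scale_color_discrete(labels = to_plotmath)
p
```

The same to_plotmath function can be reused across plots; only the labels vector in cut() needs to change when bins are added.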
