How do I sort/order values within a factor in ggplot (R)?

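Problem description

I am working with a dataset from tidytuesday and trying to order the values within each factor level.

For example, in the plot produced by the code below, I want the year values within each city to appear in ascending order (from 2012 through 2021).

How can I order them? Is there an fct_*() function for this kind of ordering?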


Summary of the data frame

str(transit_cost)

output:
tibble [537 x 21] (S3: spec_tbl_df/tbl_df/tbl/data.frame)
 $ e               : num [1:537] 7136 7137 7138 7139 7144 ...
 $ country         : Factor w/ 56 levels "Argentina","Australia",..: 9 9 9 9 9 30 9 54 54 54 ...
 $ city            : Factor w/ 140 levels "Ad Dammam","Ahmadabad",..: 131 128 128 128 128 3 80 107 69 69 ...
 $ line            : Factor w/ 366 levels "1995-98 program",..: 15 354 321 286 361 282 14 342 306 305 ...
 $ start_year      : num [1:537] 2020 2009 2020 2020 2020 ...
 $ end_year        : num [1:537] 2025 2017 2030 2030 2030 ...
 $ rr              : chr [1:537] "Not Railroad" "Not Railroad" "Not Railroad" "Not Railroad" ...
 $ length          : num [1:537] 5.7 8.6 7.8 15.5 7.4 9.7 5.8 5.1 4.2 4.2 ...
 $ tunnel_per      : num [1:537] 0.877 1 1 0.568 1 ...
 $ tunnel          : num [1:537] 5 8.6 7.8 8.8 7.4 7.1 5.8 5.1 4.2 4.2 ...
 $ stations        : num [1:537] 6 6 3 15 6 8 5 2 2 2 ...
 $ source1         : chr [1:537] "Plan" "Media" "Wiki" "Plan" ...
 $ cost            : num [1:537] 2830 3200 5500 8573 5600 ...
 $ currency        : chr [1:537] "CAD" "CAD" "CAD" "CAD" ...
 $ year            : num [1:537] 2018 2013 2018 2019 2020 ...
 $ ppp_rate        : num [1:537] 0.84 0.81 0.84 0.84 0.84 1.3 0.84 1 1 1 ...
 $ real_cost       : num [1:537] 2377 2592 4620 7201 4704 ...
 $ cost_km_millions: num [1:537] 417 301 592 465 636 ...
 $ source2         : chr [1:537] "Media" "Media" "Media" "Plan" ...
 $ reference       : chr [1:537] "https://www.translink.ca/Plans-and-Projects/Rapid-Transit-Projects/Broadway-Subway-Project.aspx" "https://www.thestar.com/news/gta/transportation/2017/12/15/trudeau-wynne-tory-on-hand-to-cut-ribbon-on-32-billi"| __truncated__ 
 $ country_code    : chr [1:537] "CA" "CA" "CA" "CA" ...

Code:

library(tidyverse)
library(tidytuesdayR)
library(scales)
library(glue)
library(countrycode)
tt <- tidytuesdayR::tt_load("2021-01-05")

transit_cost <- tt$transit_cost %>% 
  mutate_at(vars(country,city,line), as.factor) %>% 
  mutate_at(vars(start_year,end_year, real_cost), as.numeric)
transit_cost <- transit_cost %>% 
  filter(!is.na(e)) %>% 
  mutate(country = as.character(country),
         
         # if you don't convert to "char" above then due to factors it will return NA in country
         country_code = ifelse(country == "UK", "GB", country),
         country = countrycode(country_code, "iso2c", "country.name"),
         country = as.factor(country),
         tunnel_per = tunnel / length,
         rr = ifelse(rr, "Railroad", "Not Railroad"))

transit_cost

Plot code

transit_cost %>% 
  filter(country == "India") %>%
  mutate(city = fct_reorder(city, real_cost, sum)) %>% 
  
  ggplot(aes(x = real_cost, y = city, fill =  year, group = as.factor(year))) +
  geom_col() +
  scale_x_continuous(labels = scales::comma_format()) +
  labs(title = "Total real cost of Projects across Indian cities",
       subtitle = "color based on Year of Project Lines")

I also tried:

transit_cost %>% 
  filter(country == "India") %>%
  mutate(city = fct_reorder(city, real_cost, sum)) %>% 

  # added this to order them
  group_by(as.factor(year)) %>% 
  arrange(desc(year)) %>% 
  
  ggplot(aes(x = real_cost, y = city, fill =  year, group = as.factor(year))) +
  geom_col() +
  scale_x_continuous(labels = scales::comma_format()) +
  labs(title = "Total real cost of Projects across Indian cities",
       subtitle = "color based on years of Project Lines")

Tags: r, ggplot2

Solution


Try this:

transit_cost %>% 
  filter(country == "India") %>%
  mutate(city = fct_reorder(city, real_cost, sum)) %>% 
  
  ggplot() +
  geom_col(aes(x = real_cost, y = city, group = -year, fill = year)) +
  scale_x_continuous(labels = scales::comma_format()) +
  labs(title = "Total real cost of Projects across Indian cities",
       subtitle = "color based on Year of Project Lines")

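The only changes I made were setting group = -year (instead of group = as.factor(year), so the factor conversion is gone) and keeping fill = year. Negating year reverses the group order, and with it the order in which geom_col() stacks the segments within each bar.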
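
To see the effect of the group aesthetic on stacking without downloading the tidytuesday data, here is a minimal sketch on a made-up data frame. The `toy` data and its values are hypothetical, and the second plot uses position_stack(reverse = TRUE), which is an alternative I am assuming gives the same result and is not part of the answer above. It also assumes ggplot2 3.3.0 or later, so that geom_col() detects the horizontal orientation on its own.

```r
library(ggplot2)

# Hypothetical toy data: two cities with a few project years each.
toy <- data.frame(
  city      = c("A", "A", "A", "B", "B"),
  year      = c(2019, 2012, 2021, 2015, 2013),
  real_cost = c(10, 20, 30, 15, 25)
)

# Approach from the answer: negate year so the group order is reversed.
p_neg <- ggplot(toy) +
  geom_col(aes(x = real_cost, y = city, group = -year, fill = year))

# Alternative (an assumption, not in the answer): keep group = year and
# ask position_stack() to reverse its default stacking order.
p_rev <- ggplot(toy) +
  geom_col(aes(x = real_cost, y = city, group = year, fill = year),
           position = position_stack(reverse = TRUE))

# layer_data() returns the computed bar extents; sorting by xmin within
# each city shows the order in which the segments were stacked.
stacked_neg <- layer_data(p_neg)
stacked_rev <- layer_data(p_rev)
cols <- c("y", "xmin", "xmax")
stacked_neg[order(stacked_neg$y, stacked_neg$xmin), cols]
stacked_rev[order(stacked_rev$y, stacked_rev$xmin), cols]
```

In both tables the segment starting at xmin = 0 should be the earliest year of each city, with later years stacked outward from the axis. Which of the two forms to use is mostly a matter of taste: group = -year is shorter, while position_stack(reverse = TRUE) states the intent explicitly.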

