Neat way to plot this correlation using ggplot2?

Problem description

I have this dataset

                        airline avail_seat_km_per_week   Number    Year
  1:                 Aer Lingus              320906734       2 1985-99
  2:                  Aeroflot*             1197672318      76 1985-99
  3:      Aerolineas Argentinas              385803648       6 1985-99
  4:                Aeromexico*              596871813       3 1985-99
  5:                 Air Canada             1865253802       2 1985-99
 ---                                                                           
108:      United / Continental*             7139291291      14 2000-14
109: US Airways / America West*             2455687887      11 2000-14
110:           Vietnam Airlines              625084918       1 2000-14
111:            Virgin Atlantic             1005248585       0 2000-14
112:            Xiamen Airlines              430462962       2 2000-14

Here is a sample of the dataset:

data.frame(
  airline = c("Aer Lingus", "Aeroflot*", "Aerolineas Argentinas", "Aeromexico*", "Air Canada",
              "Aer Lingus", "Aeroflot*", "Aerolineas Argentinas", "Aeromexico*", "Air Canada"),
  Number = c(2, 76, 6, 3, 2, 0, 6, 1, 5, 2),
  Year = c("1985-99", "1985-99", "1985-99", "1985-99", "1985-99",
           "2000-14", "2000-14", "2000-14", "2000-14", "2000-14")
)

The dataset contains the number of crashes for airlines around the world in two periods, 1985-99 and 2000-14. I want to draw a scatterplot of the number of crashes in 1985-99 against the number in 2000-14. What is a neat way to do this with the dplyr and ggplot2 packages, preferably using pipes?

Please let me know if there is anything I can do to specify the problem further. I appreciate your help!

Tags: r, ggplot2

Solution


When asking for help with a ggplot, it helps to state clearly which data should go on each dimension (x, y, color, etc.).

library(tidyr)
library(ggplot2)

# (calling your data d)
d %>%
  # widen the data so each plot dimension gets a column
  pivot_wider(names_from = Year, values_from = Number) %>%
  # use backticks for non-standard column names (because of the dash in this case)
  ggplot(aes(x = `1985-99`, y = `2000-14`, color = airline)) +
  geom_point()
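To see what the widening step produces, here is a minimal sketch that runs the same pipeline on the ten sample rows from the question. The intermediate name `wide`, the dashed `y = x` reference line, and the axis labels are illustrative additions and not part of the original answer. It also assumes that any other columns in the full data (such as `avail_seat_km_per_week`) are identical for both periods of an airline; otherwise `pivot_wider()` would not collapse the two rows into one.

```r
library(tidyr)    # pivot_wider(); also re-exports the %>% pipe
library(ggplot2)

# Sample rows from the question, stored as d (the name used in the answer)
d <- data.frame(
  airline = rep(c("Aer Lingus", "Aeroflot*", "Aerolineas Argentinas",
                  "Aeromexico*", "Air Canada"), times = 2),
  Number = c(2, 76, 6, 3, 2, 0, 6, 1, 5, 2),
  Year = rep(c("1985-99", "2000-14"), each = 5)
)

# One row per airline, with one column per period: airline, `1985-99`, `2000-14`
wide <- d %>%
  pivot_wider(names_from = Year, values_from = Number)

p <- ggplot(wide, aes(x = `1985-99`, y = `2000-14`, color = airline)) +
  geom_point() +
  # Illustrative addition: points below the dashed y = x line had fewer
  # crashes in 2000-14 than in 1985-99
  geom_abline(slope = 1, intercept = 0, linetype = "dashed") +
  labs(x = "Crashes, 1985-99", y = "Crashes, 2000-14")

print(p)
```

Keeping the widened table in its own variable makes it easy to inspect with `print(wide)` before plotting, which is a quick check that each airline ended up on a single row.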


