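r - How can I add these data points to a chart clearly, without them looking the way they currently do?
Problem description
I have two data frames (the dput()s are at the end of this question) that I want to plot on the same chart.
I want to show the numbers of first and second appointments for any given date (as columns), as well as the number of each vaccine type given on each date, broken down by site. I have already run a count on the original data (using dplyr), but I think that plotting per site per day is what makes my chart show stacked values rather than single/total values.
I strongly suspect that my approach is wrong and that this is why the columns and the line look the way they do; it seems wrong on many levels.
I think the columns are broken into segments (because each one is a combination of many values), all stacked on top of each other, and I believe the same is true of the line.
As for the line, something is clearly wrong, because it seems to jump from one column to the next; there is no smooth transition. I have split the data into single-day values, but this still happens.
(I have added bold colours for the sake of this example; this chart is not in its final form.)
I tried using merge to combine the datasets, but I still get the same result; I am sure there is a better way to do this.
Any advice would be great.
Code for merging the data frames: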
merged <- merge(df, df2, by = 1)
colnames(merged)[1] <- "apptDTS" # Change first column name
Chart code:
ggplot(merged) +
  geom_col(aes(apptDTS, n.x), fill = "yellow", colour = "black") +
  geom_col(aes(apptDTS, n.y), fill = "blue", colour = "black") +
  geom_line(aes(x = apptDTS, y = n.x),
            colour = "green") +
  geom_line(aes(x = apptDTS, y = n.y),
            colour = "red")
dput()s:
df <- structure(list(FirstApptDTS = structure(c(1609718400, 1609718400,
1609718400, 1609718400, 1609804800, 1609804800, 1609804800, 1609804800,
1609891200, 1609891200, 1609891200, 1609891200, 1609977600, 1609977600,
1609977600, 1609977600, 1610064000, 1610064000, 1610064000, 1610064000,
1610150400, 1610150400, 1610150400, 1610150400, 1610409600, 1610409600,
1610409600, 1610409600, 1610409600, 1610496000, 1610496000, 1610496000,
1610496000, 1610496000, 1610582400, 1610582400, 1610582400, 1610582400,
1610582400, 1610668800, 1610668800, 1610668800, 1610668800, 1610668800,
1610755200, 1610755200, 1610755200, 1610755200, 1610755200, 1610928000,
1610928000, 1610928000, 1610928000, 1610928000, 1610928000, 1611014400,
1611014400, 1611014400, 1611014400, 1611014400, 1611014400, 1611100800,
1611100800, 1611100800, 1611100800, 1611100800, 1611100800, 1611187200,
1611187200, 1611187200, 1611187200, 1611187200, 1611273600, 1611273600,
1611273600, 1611273600, 1611273600, 1611360000, 1611360000, 1611360000,
1611360000, 1611360000, 1611360000, 1611532800, 1611532800, 1611532800,
1611532800, 1611532800, 1611532800, 1611532800, 1611619200, 1611619200,
1611619200, 1611619200, 1611619200, 1611705600, 1611705600, 1611705600,
1611705600, 1611705600, 1611792000, 1611792000, 1611792000, 1611792000,
1611792000, 1611878400, 1611878400, 1611878400, 1611878400, 1611878400,
1611964800, 1611964800, 1611964800, 1611964800, 1611964800), class = c("POSIXct",
"POSIXt"), tzone = ""), firstSiteLocation = c("GHGA", "LBVC1",
"STHSTVC", "STHSTVC", "GHGA", "LBVC1", "STHSTVC", "STHSTVC",
"GHGA", "LBVC1", "STHSTVC", "STHSTVC", "GHGA", "LBVC1", "STHSTVC",
"STHSTVC", "GHGA", "LBVC1", "STHSTVC", "STHSTVC", "GHGA", "LBVC1",
"STHSTVC", "STHSTVC", "GHGA", "LBVC1", "LBVC2", "STHSTVC", "STHSTVC",
"GHGA", "LBVC1", "LBVC2", "STHSTVC", "STHSTVC", "GHGA", "LBVC1",
"LBVC2", "STHSTVC", "STHSTVC", "GHGA", "LBVC1", "LBVC2", "STHSTVC",
"STHSTVC", "GHGA", "LBVC1", "LBVC2", "STHSTVC", "STHSTVC", "GHGA",
"LBVC1", "LBVC2", "STHSTVC", "STHSTVC", "WBVC1", "GHGA", "LBVC1",
"LBVC2", "STHSTVC", "STHSTVC", "WBVC1", "GHGA", "LBVC1", "LBVC2",
"STHSTVC", "STHSTVC", "WBVC1", "GHGA", "LBVC1", "LBVC2", "STHSTVC",
"WBVC1", "GHGA", "LBVC1", "LBVC2", "STHSTVC", "WBVC1", "GHGA",
"LBVC1", "LBVC2", "STHSTVC", "STHSTVC", "WBVC1", "GHGA", "LBVC1",
"LBVC2", "STHSTVC", "STHSTVC", "VC2", "WBVC1", "GHGA", "LBVC1",
"LBVC2", "STHSTVC", "WBVC1", "GHGA", "LBVC1", "LBVC2", "STHSTVC",
"WBVC1", "GHGA", "LBVC1", "LBVC2", "STHSTVC", "WBVC1", "GHGA",
"LBVC1", "LBVC2", "STHSTVC", "WBVC1", "GHGA", "LBVC1", "LBVC2",
"STHSTVC", "WBVC1"), VaccineTypeCD = c("DEF", "DEF", "ABC", "DEF",
"DEF", "DEF", "ABC", "DEF", "DEF", "DEF", "ABC", "DEF", "DEF",
"DEF", "ABC", "DEF", "DEF", "DEF", "ABC", "DEF", "DEF", "DEF",
"ABC", "DEF", "DEF", "DEF", "DEF", "ABC", "DEF", "DEF", "DEF",
"DEF", "ABC", "DEF", "DEF", "DEF", "DEF", "ABC", "DEF", "DEF",
"DEF", "DEF", "ABC", "DEF", "DEF", "DEF", "DEF", "ABC", "DEF",
"DEF", "DEF", "DEF", "ABC", "DEF", "DEF", "DEF", "DEF", "DEF",
"ABC", "DEF", "DEF", "DEF", "DEF", "DEF", "ABC", "DEF", "DEF",
"DEF", "DEF", "DEF", "DEF", "DEF", "DEF", "DEF", "DEF", "DEF",
"DEF", "DEF", "DEF", "DEF", "ABC", "DEF", "DEF", "DEF", "DEF",
"DEF", "ABC", "DEF", "DEF", "DEF", "DEF", "DEF", "DEF", "DEF",
"DEF", "DEF", "DEF", "DEF", "DEF", "DEF", "DEF", "DEF", "DEF",
"DEF", "DEF", "DEF", "DEF", "DEF", "DEF", "DEF", "DEF", "DEF",
"DEF", "DEF", "DEF"), n = c(134L, 283L, 3L, 10L, 122L, 120L,
18L, 128L, 148L, 534L, 481L, 22L, 151L, 520L, 529L, 7L, 174L,
539L, 535L, 3L, 185L, 540L, 494L, 3L, 91L, 321L, 491L, 12L, 495L,
82L, 329L, 493L, 6L, 534L, 86L, 423L, 517L, 2L, 496L, 111L, 394L,
505L, 2L, 498L, 401L, 547L, 518L, 2L, 362L, 443L, 481L, 555L,
1L, 524L, 153L, 446L, 452L, 493L, 1L, 426L, 288L, 472L, 463L,
558L, 1L, 381L, 317L, 491L, 592L, 610L, 566L, 471L, 496L, 606L,
615L, 572L, 561L, 472L, 564L, 557L, 1L, 577L, 584L, 534L, 598L,
570L, 1L, 594L, 1L, 553L, 492L, 581L, 570L, 610L, 573L, 484L,
580L, 575L, 571L, 554L, 482L, 590L, 596L, 533L, 395L, 489L, 570L,
606L, 486L, 413L, 495L, 497L, 538L, 441L, 264L)), row.names = c(59L,
61L, 63L, 64L, 66L, 68L, 70L, 71L, 73L, 74L, 76L, 77L, 79L, 81L,
83L, 84L, 86L, 88L, 90L, 91L, 93L, 95L, 97L, 98L, 109L, 111L,
113L, 115L, 116L, 118L, 120L, 122L, 124L, 125L, 127L, 129L, 131L,
133L, 134L, 136L, 138L, 140L, 142L, 143L, 145L, 147L, 149L, 151L,
152L, 154L, 156L, 158L, 160L, 161L, 163L, 165L, 167L, 169L, 171L,
172L, 174L, 176L, 178L, 180L, 182L, 183L, 185L, 187L, 189L, 191L,
193L, 195L, 197L, 199L, 201L, 203L, 205L, 207L, 209L, 211L, 213L,
214L, 216L, 218L, 220L, 222L, 224L, 225L, 228L, 229L, 231L, 233L,
235L, 237L, 239L, 241L, 243L, 245L, 247L, 249L, 251L, 253L, 255L,
257L, 259L, 261L, 263L, 265L, 267L, 269L, 271L, 273L, 275L, 277L,
279L), class = "data.frame")
and
df2 <- structure(list(SecondApptDTS = structure(c(1609545600, 1609804800,
1609891200, 1609977600, 1610064000, 1610150400, 1610409600, 1610409600,
1610496000, 1610496000, 1610496000, 1610582400, 1610582400, 1610668800,
1610668800, 1610668800, 1610755200, 1611014400, 1611187200, 1611705600,
1611878400, 1611964800, NA), class = c("POSIXct", "POSIXt"), tzone = ""),
secondSiteLocation = c("GHGA", "GHGA", "GHGA", "GHGA", "GHGA",
"GHGA", "GHGA", "LBVC1", "GHGA", "LBVC1", "STHSTVC", "GHGA",
"LBVC1", "GHGA", "LBVC1", "LBVC2", "GHGA", "LBVC1", "GHGA",
"GHGA", "STHSTVC", "GHGA", NA), VaccineType2CD = c("DEF",
"DEF", "DEF", "DEF", "DEF", "DEF", "DEF", "DEF", "DEF", "DEF",
"DEF", "DEF", "DEF", "DEF", "DEF", "DEF", "DEF", "DEF", "DEF",
"DEF", "DEF", "DEF", NA), n = c(1L, 1L, 254L, 199L, 274L,
269L, 325L, 157L, 284L, 197L, 2L, 295L, 123L, 257L, 123L,
1L, 1L, 1L, 4L, 2L, 1L, 3L, NA)), row.names = c("24", "28",
"31", "34", "37", "40", "47", "49", "51", "53", "55", "57", "59",
"62", "64", "66", "67", "68", "73", "75", "77", "78", "NA"), class = "data.frame")
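Solution
If I understand correctly, the OP wants to show
- the numbers of first and second appointments for any given date
- the number of each vaccine type given on each date
- broken down by site.
However, I am not sure I have fully understood the requirements, so my answer may need to be adjusted based on the OP's feedback.
Here is what I would do with my preferred tools (I am more familiar with, and faster in, data.table than dplyr).
Most importantly, instead of merge(), I rbind() the datasets, with an id column that distinguishes first from second appointments.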
library(data.table)
library(magrittr)
library(ggplot2)  # needed for the plots below
cols <- c("appDTS", "siteLocation", "vaccineType", "n")
# setDT() and setnames() modify df and df2 by reference
combi <- list(df, df2) %>%
  lapply(setDT) %>%
  lapply(setnames, cols) %>%
  rbindlist(idcol = "appt") %>%
  .[, appt := factor(appt, labels = c("First", "Second"))]
# 1st plot
ggplot(combi) +
  aes(appDTS, n, fill = appt) +
  geom_col() +
  scale_fill_brewer(palette = "Paired")
# 2nd plot
ggplot(combi) +
  aes(appDTS, n, fill = vaccineType) +
  geom_col() +
  scale_fill_brewer(palette = "Accent")
# 3rd plot
ggplot(combi) +
  aes(appDTS, n, fill = siteLocation) +
  geom_col()
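Note that I have chosen a different colour palette for each plot to make it clear that a different variable is colour-coded in each one.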
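To see why row-binding suits this data better than merge(), a minimal base R sketch on made-up data may help. The data frames first and second and their values are hypothetical and only mimic the shape of df and df2: several rows (one per site) share the same date. Joining on the date alone then pairs every row of one table with every row of the other, so counts are repeated.

```r
# Hypothetical toy data: several sites share the same date in each table
first  <- data.frame(appDTS = as.Date(rep("2021-01-04", 2)),
                     site = c("A", "B"), n = c(10L, 20L))
second <- data.frame(appDTS = as.Date(rep("2021-01-04", 3)),
                     site = c("A", "B", "C"), n = c(1L, 2L, 3L))

# merge() on the date alone: every first row is paired with every second row
merged <- merge(first, second, by = "appDTS")
nrow(merged)     # 6 rows (2 x 3)
sum(merged$n.x)  # 90, although only 30 first appointments exist

# rbind() with an id column keeps each original row exactly once
stacked <- rbind(cbind(appt = "First", first),
                 cbind(appt = "Second", second))
nrow(stacked)                         # 5 rows
tapply(stacked$n, stacked$appt, sum)  # First = 30, Second = 6
```

With the merged table, geom_col() stacks the repeated n.x and n.y values for each date, which is consistent with the segmented, inflated columns described in the question.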
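Edit
The OP has clarified:
I want a plot that shows the dates on the x-axis and a count on the y-axis, with columns, and also two lines indicating how many of each vaccine were given on each day.
To plot the number of vaccines given per day, we need to aggregate the data further. With data.table, this is done by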
combi[!is.na(n), .(n = sum(n)), by = .(appDTS, vaccineType)]
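The effect of this aggregation can be checked on a small table. The following is only a sketch: toy and its values are hypothetical stand-ins for combi, with two sites reporting the same vaccine type on one day and one all-NA row like the last row of df2.

```r
library(data.table)

# Hypothetical stand-in for combi: two "DEF" rows on the same day, one NA row
toy <- data.table(
  appDTS      = as.Date(c("2021-01-04", "2021-01-04", "2021-01-04",
                          "2021-01-05", NA)),
  vaccineType = c("DEF", "DEF", "ABC", "DEF", NA),
  n           = c(134L, 283L, 3L, 122L, NA)
)

# Drop rows without a count, then sum n per date and vaccine type
daily <- toy[!is.na(n), .(n = sum(n)), by = .(appDTS, vaccineType)]
daily
# One row per date/vaccine combination; the two "DEF" rows on
# 2021-01-04 collapse into a single row with n = 417
```

Because each date and vaccine type now has exactly one value, a line drawn through these points no longer jumps between several values on the same day.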
Now, the plot with the line overlay can be created by
ggplot(combi) +
  aes(appDTS, n, fill = appt) +
  geom_col() +
  scale_fill_brewer(palette = "Paired") +
  geom_line(
    aes(appDTS, n, colour = vaccineType),
    data = combi[!is.na(n), .(n = sum(n)), by = .(appDTS, vaccineType)],
    inherit.aes = FALSE, size = 1) +
  scale_color_brewer(palette = "Set1")
inherit.aes = FALSE is required to avoid an error message caused by the appt variable (mapped to the fill aesthetic) being absent from the aggregated dataset.
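The point about inherit.aes can be reproduced in isolation. This is a minimal sketch on hypothetical data: bars, totals and the helper fails() are made up for illustration, and it assumes a ggplot2 version that evaluates all inherited mappings for a layer.

```r
library(ggplot2)

# Hypothetical data: bars has the fill variable appt, totals does not
bars   <- data.frame(day  = c(1, 1, 2, 2),
                     n    = c(5, 3, 4, 6),
                     appt = c("First", "Second", "First", "Second"))
totals <- data.frame(day = c(1, 2), n = c(8, 10))

base <- ggplot(bars) + aes(day, n, fill = appt) + geom_col()

# The line layer inherits fill = appt, which totals cannot supply
p_inherit <- base + geom_line(aes(day, n), data = totals)
# The line layer uses only its own mapping
p_own     <- base + geom_line(aes(day, n), data = totals, inherit.aes = FALSE)

# Hypothetical helper: TRUE if building the plot raises an error
fails <- function(p) inherits(try(ggplot_build(p), silent = TRUE), "try-error")
fails(p_inherit)
fails(p_own)
```

Under that assumption, only the first plot fails to build, because the inherited fill mapping looks for appt in totals.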