as.Date function fails to correctly format the dates in a dataset

Question

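I am working with the dataset below and want to plot it with ggplot2. However, calling ggplot2 on the data produces a strange chart, which may be the result of a wrong date format. dput() shows: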

    structure(list(date = structure(c(18283, 18284, 18285, 18288, 
    18289, 18290, 18291, 18292, 18295, 18296, 18297, 18298, 18299, 
    18302, 18303, 18304, 18305, 18306, 18309, 18310, 18311, 18312, 
    18313, 18316, 18317, 18318, 18319, 18320, 18323, 18324, 18325, 
    18326, 18327, 18330, 18331, 18332, 18333, 18334, 18337, 18338, 
    18339, 18340, 18341, 18344, 18345, 18346, 18347, 18348, 18351, 
    18352, 18353, 18354, 18355, 18358, 18359, 18360, 18361, 18362, 
    18365, 18366, 18367, 18368, 18369, 18372, 18373, 18374, 18375, 
    18376, 18379, 18380, 18381), class = "Date"), oil_price = c("56.76", 
    "55.51", "54.09", "53.09", "53.33", "53.29", "52.19", "51.58", 
    "50.06", "49.59", "50.87", "50.94", "50.34", "49.59", "50.0", 
    "51.13", "51.41", "52.03", ".", "52.1", "53.31", "53.77", "53.36", 
    "51.36", "49.78", "48.67", "47.17", "44.83", "46.78", "47.27", 
    "46.78", "45.9", "41.14", "31.05", "34.47", "33.13", "31.56", 
    "31.72", "28.96", "26.96", "20.48", "25.09", "19.48", "23.33", 
    "21.03", "20.75", "16.6", "15.48", "14.1", "20.51", "20.28", 
    "25.18", "28.36", "26.21", "23.54", "24.97", "22.9", ".", "22.36", 
    "20.15", "19.96", "19.82", "18.31", "-36.98", "8.91", "13.64", 
    "15.06", "15.99", "12.17", "12.4", "15.04")), row.names = c(NA, 
    -71L), class = "data.frame")

str() shows:

'data.frame':   71 obs. of  2 variables:
    $ date     : Date, format: "2020-01-22" "2020-01-23" "2020-01-24" "2020-01-27" ...
    $ oil_price: chr  "56.76" "55.51" "54.09" "53.09" ...

[Image: the strange ggplot chart produced from this data]

To generate the chart I used:

g2 <- ggplot(data = wti, mapping = aes(x = date, y = oil_price)) +
  geom_line() +
  labs(title = "Daily Oil Prices of Q1 2020",
       x = "Date",
       y = "Oil Prices")

I ran into a similar problem with another dataset and chart, where reformatting the dates in that set with the following command:

file_name$date <- as.Date(file_name$date, '%m/%d/%Y')

fixed the chart. However, when I use the same command on this dataset I have no luck, and I am still stuck with a strange chart.

I would appreciate any suggestions on how to get the data into the correct format. Thanks!

Tags: r, ggplot2

Answer


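Your x-axis has a Date variable, and that looks fine. The problem is the y-axis! Your oil_price column is of class character (string), not numeric (note the quotes around the numbers in the dput() output and the "chr" abbreviation in the str() output). Convert it to numeric with file_name$oil_price <- as.numeric(file_name$oil_price) and everything should be fine.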
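As a minimal sketch of that conversion, the snippet below rebuilds three rows of the posted data (the 18th to 20th entries of the dput() output) under the asker's name wti. One assumption that goes beyond the answer: the "." entries in oil_price are treated as missing quotes and set to NA before converting, because as.numeric() would otherwise turn them into NA with a coercion warning. The plotting step is guarded so the snippet still runs where ggplot2 is not installed.

```r
# Three rows copied from the posted dput() output, including one "." entry.
wti <- data.frame(
  date = as.Date(c("2020-02-14", "2020-02-17", "2020-02-18")),
  oil_price = c("52.03", ".", "52.1"),
  stringsAsFactors = FALSE
)

# Assumption: "." marks a missing quote, so replace it with NA explicitly.
wti$oil_price[wti$oil_price == "."] <- NA

# Convert the character column to numeric so ggplot2 uses a continuous y scale.
wti$oil_price <- as.numeric(wti$oil_price)

str(wti)

# Build the plot only if ggplot2 is available.
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  g2 <- ggplot(data = wti, mapping = aes(x = date, y = oil_price)) +
    geom_line() +
    labs(title = "Daily Oil Prices of Q1 2020", x = "Date", y = "Oil Prices")
}
```

After the conversion, str() should report oil_price as num instead of chr.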

Note that you would run into the same problem if the column were of class factor, but to convert a factor to numeric you should go through character: file_name$numeric_column <- as.numeric(as.character(file_name$factor_column))
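A short illustration of why the detour through character matters, using a hypothetical factor named prices built from the first three values of the posted data. Calling as.numeric() directly on a factor returns its internal integer level codes, not the numbers the labels spell out.

```r
# Hypothetical factor holding prices as labels; levels are sorted alphabetically.
prices <- factor(c("56.76", "55.51", "54.09"))

# Direct conversion returns the level codes.
wrong <- as.numeric(prices)
print(wrong)   # 3 2 1

# Going through character recovers the actual values.
right <- as.numeric(as.character(prices))
print(right)   # 56.76 55.51 54.09
```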

