Converting day and month names to Date format in R

Problem description

I want to convert this data to Date format and create a new column holding month-year values:

month         : Factor w/ 10 levels "apr","aug","dec",..: 7 7 7 7 7 7 7 7 7 7 ...
day           : chr [1:41188] "mon" "mon" "mon" "mon" ...
year          : num [1:41188] 2008 2008 2008 2008 2008 ...

Here is the dput() output:

dput(head(df))

df <-
structure(list(month = structure(c(7L, 7L, 7L, 7L, 7L, 7L), 
.Label = c("apr", "aug", "dec", "jul", "jun", "mar", "may", 
"nov", "oct", "sep"), class = "factor"), day = c("mon", "mon", 
"mon", "mon", "mon", "mon"), year = c(2008, 2008, 2008, 2008, 
2008, 2008)), class = "data.frame", row.names = c(NA, -6L))

The main problem is the month and day columns, because they are stored as a factor and as character strings.

I tried the following:

as.integer(factor(df$month, levels=month.abb))

and this:

match(df$month, month.abb)

I also did this:

df$date<-paste(as.character(df$month), df$year)

This works and returns:

$ date          : chr [1:41188] "may 2008" "may 2008" "may 2008" "may 2008" 

How can I convert this to Date format?

Tags: r, date, character

Solution


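Since a month, a weekday name, and a year do not identify a single date, I'll arbitrarily pick the first occurrence of each listed weekday in that month. To make it more interesting, I'll change the weekdays so that the data has some variability.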

df <-
structure(list(month = structure(c(7L, 7L, 7L, 7L, 7L, 7L), 
.Label = c("apr", "aug", "dec", "jul", "jun", "mar", "may", 
"nov", "oct", "sep"), class = "factor"), day = c("mon", "tue", 
"wed", "fri", "sat", "sun"), year = c(2008, 2008, 2008, 2008, 
2008, 2008)), class = "data.frame", row.names = c(NA, -6L))

df
#   month day year
# 1   may mon 2008
# 2   may tue 2008
# 3   may wed 2008
# 4   may fri 2008
# 5   may sat 2008
# 6   may sun 2008

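From here, we need to determine which weekday the first day of each month falls on, then find the first day on or after it that falls on the weekday given in the data.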

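# Weekday of the 1st of each month; POSIXlt's $wday runs 0-6 with Sunday as 0
firstdow <- as.POSIXlt(paste(df$year, df$month, "01", sep = "-"), format = "%Y-%b-%d")$wday
# Weekday from the data, numbered 1-7 with Monday as 1 (Sunday, 7, equals $wday 0 modulo 7)
datadow <- match(df$day, c("mon", "tue", "wed", "thu", "fri", "sat", "sun"))
# Day of month of the first such weekday: offset from the 1st (0-6), plus 1
datadom <- (datadow - firstdow) %% 7 + 1
df$date <- as.Date(paste(df$year, df$month, datadom, sep = "-"), format = "%Y-%b-%d")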
df
#   month day year       date
# 1   may mon 2008 2008-05-05
# 2   may tue 2008 2008-05-06
# 3   may wed 2008 2008-05-07
# 4   may fri 2008 2008-05-02
# 5   may sat 2008 2008-05-03
# 6   may sun 2008 2008-05-04
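
Parsing month names with "%b" depends on the session locale, so the same idea can also be sketched without it. The helper below is only an illustration, not part of the original answer: the name first_weekday_date is hypothetical, and it assumes lower-case English abbreviations like those in the data.

```r
# Hypothetical helper: date of the first occurrence of a weekday in a month.
# Assumes English three-letter abbreviations such as "may" and "mon".
first_weekday_date <- function(year, month, day) {
  # Month number 1..12; as.character() handles factor input.
  mon <- match(tolower(as.character(month)), tolower(month.abb))
  # Target weekday in POSIXlt's convention: 0 = Sunday, ..., 6 = Saturday.
  target <- match(tolower(as.character(day)),
                  c("sun", "mon", "tue", "wed", "thu", "fri", "sat")) - 1
  # First day of the month, built from numbers so no month names are parsed.
  first <- as.Date(sprintf("%04d-%02d-01", as.integer(year), mon),
                   format = "%Y-%m-%d")
  first_wday <- as.POSIXlt(first)$wday
  # Step forward 0..6 days from the 1st to reach the target weekday.
  first + (target - first_wday) %% 7
}

first_weekday_date(2008, "may", "mon")
first_weekday_date(c(2008, 2008), c("may", "jun"), c("fri", "mon"))
```

With the sample data this could be used as df$date <- first_weekday_date(df$year, df$month, df$day).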

And to verify that each date falls on the weekday given in the data:

format(df$date, format = "%a")
# [1] "Mon" "Tue" "Wed" "Fri" "Sat" "Sun"
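
For the month-year column asked about in the question, the weekday is not needed at all. The sketch below is one possible approach rather than the answer's own method; the column names month_start and month_year are made up for illustration. It also shows why match(df$month, month.abb) returns NA on this data: month.abb is capitalized ("Jan", "Feb", ...) while the data is lower case.

```r
# Sample data shaped like the question's dput() output.
df <- data.frame(
  month = factor(rep("may", 6)),
  day   = c("mon", "tue", "wed", "fri", "sat", "sun"),
  year  = rep(2008, 6)
)

# Compare in lower case so that "may" matches "May" in month.abb.
month_num <- match(tolower(as.character(df$month)), tolower(month.abb))

# A real Date pinned to the first of the month (sorts and plots as a date).
df$month_start <- as.Date(
  sprintf("%04d-%02d-01", as.integer(df$year), month_num),
  format = "%Y-%m-%d"
)

# A character month-year label derived from that Date.
df$month_year <- format(df$month_start, "%Y-%m")

str(df)
```

Keeping month_start as a Date and deriving the label from it means the label can be reformatted later without parsing names again.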
