Print all years in a data frame using R

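Problem description

I have a data frame with two date columns, Created.Date and Last.Accessed.

I want to print, in a loop, every year that occurs in the data.

I tried to write code that does this, but it gives me an error: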

# Libraries
library(ggplot2)
library(data.table)
library(tidyr)
library(dplyr) # select two columns

# Read data
df = read.csv2(text = "File.Name|Created.Date|Last.Accessed|Visual.Group
60be1ba43bf7cjpg|1989-11-17 06:25:22|2017-07-15 01:25:22|0
60be1ba43bf89jpg|1989-02-04 04:03:16|2021-12-17 04:03:16|1
60be1ba43bf8djpg|2017-04-22 14:57:13|2017-11-17 23:57:13|2
60be1ba43bf90jpg|2021-04-12 23:03:44|2018-11-17 05:03:44|3
60be1ba43bf93jpg|2019-08-28 18:23:16|1989-09-07 12:23:16|4
60be1ba43bf95jpg|1989-09-11 08:16:20|2020-03-17 10:16:20|5
60be1ba43bf98jpg|2018-08-01 16:56:05|2017-04-24 03:56:05|5
60be1ba43c0b2jpg|1989-06-23 23:50:52|2017-09-08 02:50:52|56
60be1ba43c0b5jpg|2019-09-01 04:29:25|2020-10-25 00:29:25|56
60be1ba43c0b8jpg|2020-08-08 07:08:47|2021-05-22 20:08:47|57
60be1ba43c0bbjpg|2018-04-11 07:32:17|2018-06-21 12:32:17|58
60be1ba43c0bdjpg|2021-05-26 08:32:28|1989-02-04 12:32:28|58
60be1ba43c0c0jpg|1989-11-25 22:22:37|2019-07-16 04:22:37|58
60be1ba43c0c4jpg|2018-02-03 10:37:57|2019-08-02 08:37:57|58
60be1ba43c0c7jpg|2018-08-18 06:36:04|1989-03-17 08:36:04|58
60be1ba43c0cajpg|2019-02-12 23:31:52|2020-06-17 13:31:52|59", 
               sep="|",stringsAsFactors=TRUE, na.strings="unknown");


# Remove duplicates (Visual group defines duplicate)
df <-df[!duplicated(df$Visual.Group), ]

# Extract year
df$Created.Date.Year <- format(as.Date(df$Created.Date, format="%Y-%m-%d"), format="%Y");
df$Last.Accessed.Year <- format(as.Date(df$Last.Accessed, format="%Y-%m-%d"), format="%Y");

# Merge years from created and last accessed
df_years <- unique(c(df$Created.Date.Year,df$Last.Accessed.Year))

number_of_years <- length(df_years) # count

for(x in 1:number_of_years) {
  year <- df_years[x, 1:1]
  cat(year, "\n"); 
}

Update: the exact expected output:

1989
2017
2018
2019
2020
2021

Tags: r

Solution

No loop is needed. Once you have extracted the years:

unique(c(df$Created.Date.Year,df$Last.Accessed.Year))
[1] "1989" "2017" "2021" "2019" "2020" "2018"
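
A note on the error in the question: `df_years` is a plain character vector, not a data frame, so `df_years[x, 1:1]` fails with "incorrect number of dimensions". A vector takes a single subscript. Also, `unique()` keeps values in order of first appearance, so the result above is not in the ascending order shown in the expected output. If a loop is still wanted, a minimal sketch could look like the following. The `all_years` vector here is an illustrative stand-in for `c(df$Created.Date.Year, df$Last.Accessed.Year)` so that the snippet runs on its own.

```r
# Illustrative stand-in for c(df$Created.Date.Year, df$Last.Accessed.Year);
# these values are made up for the example, not read from the data frame.
all_years <- c("1989", "2017", "2021", "2019", "2017", "2020", "2018", "1989")

# unique() preserves first-appearance order, so sort() is added
# to get ascending years.
df_years <- sort(unique(all_years))

# Iterate over the values directly; a vector needs one subscript at most.
for (year in df_years) {
  cat(year, "\n")
}
```

With the real data, replacing `all_years` by the combined year columns should print each distinct year on its own line.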

Edit: the frequency of each year?

table(c(df$Created.Date.Year,df$Last.Accessed.Year))
1989 2017 2018 2019 2020 2021 
   5    4    3    2    3    3
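
`table()` orders its counts by year label. If the counts are easier to work with as a data frame, or should be ordered by frequency, one possible sketch is below. The `all_years` vector is again an illustrative stand-in for the combined year columns, and the column names `year` and `n` are chosen here for the example.

```r
# Illustrative stand-in for c(df$Created.Date.Year, df$Last.Accessed.Year).
all_years <- c("1989", "2017", "2021", "2019", "2017", "2020", "2018", "1989")

# Count occurrences of each year.
year_counts <- table(all_years)

# Convert the table to a data frame with character year labels.
counts_df <- as.data.frame(year_counts, stringsAsFactors = FALSE)
names(counts_df) <- c("year", "n")

# Most frequent years first; ties broken by year.
counts_df <- counts_df[order(-counts_df$n, counts_df$year), ]
print(counts_df)
```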
