Finding all time-varying variables in a panel dataset in R

Question

Consider a dataset of the following form:

person year A B   C
1      2000 1 0.5 cat
1      2001 1 0.7 NA
1      2002 1 0.5 dog
2      2000 3 0.3 dog
2      2001 3 0.4 dog
2      2002 3 0.9 dog

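The dataset has a panel structure: each person is followed over time. Ignoring NA values, how can I find out in R which variables are time-varying and which are time-invariant? Can anyone help? Thanks in advance!

Tags: r

Answer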


Here is a dplyr version:

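# load the package (across() requires dplyr >= 1.0.0)
library(dplyr)

# create the sample data frame
df <- data.frame(
    person = c("A", "B", "A", "B")
    , year = c(2000, 2000, 2001, 2001)
    , A = c("k", "k", "k", "k")
    , B = c("m", "l", "m", "m")
    , C = c("m", NA, "m", "m")
)

# a function that gives `TRUE` if all non-NA values in a vector are the same
# (NAs are dropped first, as the question asks; an all-NA vector also gives `TRUE`)
allTheSame <- function(x) length(unique(x[!is.na(x)])) <= 1

# and now
df %>% # we take the data frame
    group_by(person) %>% # group it by person
    summarise(across(-year, allTheSame)) # and apply our function to every column except year

# the result has one row per person: `TRUE` means the variable
# is time-invariant for that person, `FALSE` means it varies over time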
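
The dplyr code above answers the question person by person. If a single verdict per variable is wanted, one possible approach is to call a variable time-invariant only when it is constant within every person. The following is a minimal base R sketch of that idea, run on the data from the question. The names `panel`, `constant_ignoring_na`, `is_invariant`, `time_invariant` and `time_varying` are made up for illustration, and it assumes the id column is called `person` and the time column `year`.

```r
# the data from the question
panel <- data.frame(
    person = c(1, 1, 1, 2, 2, 2),
    year   = c(2000, 2001, 2002, 2000, 2001, 2002),
    A      = c(1, 1, 1, 3, 3, 3),
    B      = c(0.5, 0.7, 0.5, 0.3, 0.4, 0.9),
    C      = c("cat", NA, "dog", "dog", "dog", "dog")
)

# TRUE if a vector has at most one distinct non-NA value
constant_ignoring_na <- function(x) length(unique(x[!is.na(x)])) <= 1

# every column except the id and time columns (assumed names)
vars <- setdiff(names(panel), c("person", "year"))

# for each variable: is it constant within every person?
is_invariant <- sapply(vars, function(v) {
    all(tapply(panel[[v]], panel$person, constant_ignoring_na))
})

time_invariant <- names(is_invariant)[is_invariant]
time_varying   <- names(is_invariant)[!is_invariant]

print(time_invariant)
print(time_varying)
```

With this definition, C counts as time-varying because person 1 has both cat and dog, even though it is constant for person 2. Whether "varies for at least one person" is the right rule depends on the analysis, so treat it as one reasonable choice rather than the only one.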
