Variance-covariance matrix: from vector format to matrix format

Question

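I have an NxN variance-covariance matrix that comes in the form of two vectors. One vector contains the variances, the other contains the covariances. I made an example with a reduced N to illustrate. The real problem is a 1500x1500 matrix: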

What I have:

    library(tidyverse)
    N <- 4
    names <- c("a","b","c","d")

    # Empty N x N matrix with row and column names
    matrix_var_cov <- matrix(data = NA, nrow = N, ncol = N) %>%
      `colnames<-`(., names) %>% `rownames<-`(., names)

    # Diagonal entries (variances)
    variance <- as.data.frame(c("aa","bb","cc","dd")) %>%
      `colnames<-`(., "Variance")

    # Off-diagonal entries (covariances), ordered column by column
    covariance <- as.data.frame(c("ab","ac","bc","ad","bd","cd")) %>%
      `colnames<-`(., "Covariance")

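As you can see from the covariance data frame, the order is given column by column: from column B I have AB; from column C, AC and BC; from column D, AD, BD and CD; and so on.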

There are several ways to interpret what I described above; this is how I read it.

What I need as output:

output <- data.frame(
  c("aa","ab","ac","ad"),
  c("ab","bb","bc","bd"),
  c("ac","cb","cc","cd"),
  c("ad","bd","cd","dd")) %>% 
   `colnames<-`(.,names) %>% `rownames<-`(.,names)

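So what I really need is to build the data frame from the information in the variance and covariance vectors. Is there a smart way to do this? And no, the source of the information cannot be changed.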

Tags: r statistics

Answer


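One option is to specify the diagonal and off-diagonal elements directly. gdata provides the functions upperTriangle and lowerTriangle, which let us supply the data by row (in base R, lower.tri and upper.tri address the entries "by column").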

# Sample data (N and names as defined in the question)
N <- 4
names <- c("a","b","c","d")
mat <- matrix(data = NA, nrow = N, ncol = N, dimnames = list(names, names))
variance <- c("aa","bb","cc","dd")
covariance <- c("ab","ac","bc","ad","bd","cd")

library(gdata)
diag(mat) <- variance
lowerTriangle(mat, byrow = TRUE) <- covariance
upperTriangle(mat, byrow = TRUE) <- lowerTriangle(mat)
mat
#    a    b    c    d
#a "aa" "ab" "ac" "ad"
#b "ab" "bb" "bc" "bd"
#c "ac" "bc" "cc" "cd"
#d "ad" "bd" "cd" "dd"
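The difference between "by row" and "by column" ordering can be checked directly. The following is a small illustrative sketch in base R (the variable names idx, by_column and by_row are made up for this example): it lists the upper-triangle positions in the order R visits them, then in row order.

```r
names <- c("a", "b", "c", "d")
m <- matrix(NA, nrow = 4, ncol = 4)

# Upper-triangle positions in the order base R visits them (column by column)
idx <- which(upper.tri(m), arr.ind = TRUE)
by_column <- paste0(names[idx[, "row"]], names[idx[, "col"]])
by_column
# [1] "ab" "ac" "bc" "ad" "bd" "cd"

# The same positions, visited row by row
idx_row <- idx[order(idx[, "row"], idx[, "col"]), ]
by_row <- paste0(names[idx_row[, "row"]], names[idx_row[, "col"]])
by_row
# [1] "ab" "ac" "ad" "bc" "bd" "cd"
```

The column-by-column order is exactly the order of the covariance vector in the question, which is why it can be assigned to the upper triangle without reordering.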

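We can achieve the same in base R by (1) filling the upper triangle, (2) transposing the matrix so that the entries now in the lower triangle are in the correct order, and (3) filling the upper triangle again.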

# Sample data (N and names as defined in the question)
N <- 4
names <- c("a","b","c","d")
mat <- matrix(data = NA, nrow = N, ncol = N, dimnames = list(names, names))
variance <- c("aa","bb","cc","dd")
covariance <- c("ab","ac","bc","ad","bd","cd")

diag(mat) <- variance
mat[upper.tri(mat)] <- covariance
mat <- t(mat)
mat[upper.tri(mat)] <- covariance
mat
#    a    b    c    d
#a "aa" "ab" "ac" "ad"
#b "ab" "bb" "bc" "bd"
#c "ac" "bc" "cc" "cd"
#d "ad" "bd" "cd" "dd"
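For the real 1500x1500 case, the base R steps could be wrapped in a function with a length check. This is only a sketch: build_var_cov is a hypothetical helper name, the random numbers stand in for the real variances and covariances, and it assumes the covariance vector is ordered column by column as in the question.

```r
# Hypothetical helper (not part of base R or gdata): builds a symmetric
# matrix from a variance vector and a covariance vector that is ordered
# column by column along the upper triangle.
build_var_cov <- function(variance, covariance, names = NULL) {
  n <- length(variance)
  # An n x n symmetric matrix has n * (n - 1) / 2 distinct off-diagonal entries
  stopifnot(length(covariance) == n * (n - 1) / 2)
  mat <- matrix(NA, nrow = n, ncol = n, dimnames = list(names, names))
  diag(mat) <- variance
  mat[upper.tri(mat)] <- covariance  # (1) fill the upper triangle
  mat <- t(mat)                      # (2) move those entries to the lower triangle
  mat[upper.tri(mat)] <- covariance  # (3) fill the upper triangle again
  mat
}

# Made-up numeric data at the size mentioned in the question
set.seed(1)
n <- 1500
variance <- runif(n)
covariance <- runif(n * (n - 1) / 2)

big <- build_var_cov(variance, covariance)
dim(big)
isSymmetric(big)
```

The length check fails early if the two vectors do not belong to the same N, and isSymmetric gives a quick sanity check on the result.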

Note that your expected output seems to contain a typo: it has a "cb" entry that does not exist in your covariance vector.
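A quick way to spot such an entry is to compare the expected values against the two input vectors. This is a small sketch; the expected vector is copied from the output in the question, and the name unknown is made up for this example.

```r
variance <- c("aa", "bb", "cc", "dd")
covariance <- c("ab", "ac", "bc", "ad", "bd", "cd")

# Entries of the expected output from the question, column by column
expected <- c("aa", "ab", "ac", "ad",
              "ab", "bb", "bc", "bd",
              "ac", "cb", "cc", "cd",
              "ad", "bd", "cd", "dd")

# Values that appear in the expected output but in neither input vector
unknown <- setdiff(expected, c(variance, covariance))
unknown
# [1] "cb"
```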

