Construct a correlation matrix from table with non-aligned dates

Problem description

I had a look at this post to compute the correlation matrix given an input table.

My issue is that my columns are not consistently aligned.

For instance:

([]date:.z.d+til 100;a:100?10f;b:(10#0n),90?1f;c:(90?1f),(10#0n))

date       a          b           c          
---------------------------------------------
2019.11.18 6.018138               0.1357346  
2019.11.19 2.365495               0.9805366  
2019.11.20 0.5136894              0.2821858  
2019.11.21 9.013581               0.4946025  
2019.11.22 1.0842                 0.967023   
2019.11.23 4.543989               0.6901084  
2019.11.24 4.597627               0.6303566  
2019.11.25 2.18889                0.01415349 
2019.11.26 3.050233               0.2783062  
2019.11.27 5.259109               0.6675121  
2019.11.28 5.175593   0.1684333   0.3706485  
2019.11.29 5.14162    0.5885103   0.4183277

I don't want to remove all the rows containing null values before computing the correlation matrix, as I have many columns and the intersection of all the dates could be the empty set.

Instead, I would like to apply n*(n-1)/2 operations to populate a correlation matrix that I construct myself: for each pair of columns, say a and b, take their joint series (the rows where both are non-null), compute the correlation, and put the result in my correlation matrix C at both C[1,2] and C[2,1].

I emphasize the n*(n-1)/2 operations because the answers in the post I mentioned above seem to do n*n operations (my n is roughly equal to 700).

Tags: correlation, kdb

Solution


This might get you something close to what you're looking for:

q)m:(1_cols t)!();
q){x{m::m,'key[x]!1f,value(1_x)cor\:y;1_x}/x}1_flip t

q)m^flip m
 | a          b           c
-| ----------------------------------
a| 1          0.01418217  0.04938382
b| 0.01418217 1           -0.06297328
c| 0.04938382 -0.06297328 1

This uses only 3 cor operations for the 3 columns, i.e. n*(n-1)/2 rather than n*n.
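
The answer above passes whole columns to cor, so null handling is left to cor itself. If each pair should instead be computed strictly on the rows where both columns are non-null (the "joint series" from the question), a minimal sketch along the following lines may help. It assumes t holds the table from the question with date as its first column; pc and pcm are hypothetical helper names, not part of q, and the code is written as a script rather than for the q) prompt.

```q
/ hypothetical helper: correlation over the rows where both x and y are non-null
pc:{i:where not null x+y; cor[x i;y i]}

/ hypothetical helper: fills the upper triangle with n*(n-1)%2 calls to pc, then mirrors it
pcm:{[t]
  d:1_flip t;                 / drop the date column; d maps column name -> vector
  c:key d;
  r:{[d;c;i](i#0n),1f,pc[d c i;]each d[(i+1)_c]}[d;c]each til count c;
  r:r^flip r;                 / copy the upper triangle into the null lower triangle
  c!c!/:r}

pcm t
```

Row i of r holds i nulls, then 1 on the diagonal, then the correlations of column i with each later column, so only the pairs with i<j are ever computed. A pair with no overlapping rows stays null in both positions.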

