首页 > 解决方案 > Join two kdb tables based on 1 exact value and closest time value

问题描述

I have a table called mySec.

Map[sec] = expiry is fixed i.e. for one sec the expiry will be constantly same.

time          sec  expiry           V1
----------------------------------------
09:25:01.012  100  2021.01.01     0.2032
09:25:16.012  104  2021.06.01     0.1952
09:25:31.026  106  2020.12.01     0.1625
09:25:46.032  111  2021.11.31     0.2625
09:26:01.012  110  2021.04.31     0.3625
09:26:16.048  104  2021.06.01     0.1955
09:26:31.027  106  2020.12.01     0.1629

I have another table called myParam

time          expiry        V2
----------------------------------------
09:25:01.007  2021.01.01     2.3
09:25:01.018  2021.12.01     2.8
09:25:15.025  2021.06.01     4.1
09:25:16.025  2021.06.01     4.0
09:25:31.012  2020.12.01     2.1
09:25:46.032  2021.04.31     2.2
09:25:46.059  2021.11.31     5.2
09:26:23.012  2021.04.31     1.9
09:26:58.012  2021.06.01     3.3
09:27:09.012  2020.12.01     3.5

The result called resTable should be a table based on the fact that expiry should match exactly and row picked in myParam should be nearest to the time in mySec.

For e.g. if consider the 6th row in mySec at time=09:26:16.048, its' expiry is 2021.06.01 and for this expiry in myParam there are 3 rows. Out of these 3 rows, I want to pick the row with latest time such that time <= 09:26:16.048.

time          sec  expiry           V1     V2
-----------------------------------------------
09:25:01.012  100  2021.01.01     0.2032   2.3
09:25:16.012  104  2021.06.01     0.1952   4.1
09:25:31.026  106  2020.12.01     0.1625   2.1
09:25:46.032  111  2021.11.31     0.2625   NaN
09:26:01.012  110  2021.04.31     0.3625   2.2
09:26:16.048  104  2021.06.01     0.1955   4.0
09:26:31.027  106  2020.12.01     0.1629   2.1

The number of rows in resTable would be same as in mySec. I don't care if some element of V2 is NaN or 0 or some other value.

How do I achieve this in q?

标签: databasekdb

解决方案


That kind of join can be done using aj which takes the last/most recent value in an interval. Below is how to apply for your use case (with the dates cleaned up):

q)aj[`expiry`time;mySec;myParam]
time         sec expiry     V1     V2
--------------------------------------
09:25:01.012 100 2021.01.01 0.2032 2.3
09:25:16.012 104 2021.06.01 0.1952 4.1
09:25:31.026 106 2020.12.01 0.1625 2.1
09:25:46.032 111 2021.11.30 0.2625
09:26:01.012 110 2021.04.30 0.3625 2.2
09:26:16.048 104 2021.06.01 0.1955 4
09:26:31.027 106 2020.12.01 0.1629 2.1

Note that the order of the columns is important, an exact match will be done on expiry and <= check done on the time columns.

Additionally this type of join will be more efficient if attributes are applied correctly on your table. This is discussed in good detail in the aj documentation on code.kx.com.


推荐阅读