mysql - 连接多个表的 PSQL 问题(MIMIC II 数据库)
问题描述
我有一个包含多个表的患者数据库,我需要在结果表中连接来自不同表的特定列(预处理)。这是我的做法:
SELECT d_patients.subject_id, icustay_detail.subject_icustay_seq,
icustay_detail.icustay_id,
icustay_detail.icustay_admit_age, d_patients.sex,
icustay_detail.icustay_los/(60*24) los,
icustay_detail.weight_first/POWER(icustay_detail.height/100,2) bmi,
COALESCE( a.additives, 0 ) as additives,
COALESCE( p.procedureevents, 0 ) as procedureevents,
COALESCE( d.duration, 0 ) as duration,
COALESCE ( m.medications,0 ) as medications
FROM icustay_detail
INNER JOIN d_patients
ON icustay_detail.subject_id = d_patients.subject_id
LEFT JOIN
( SELECT subject_id, count(*) as additives
FROM additives
GROUP BY subject_id
) a
ON icustay_detail.subject_id = a.subject_id
LEFT JOIN
( SELECT subject_id, count(*) as procedureevents
FROM procedureevents
GROUP BY subject_id
) p
ON icustay_detail.subject_id = p.subject_id
LEFT JOIN
( SELECT subject_id, SUM(duration) as duration
FROM a_meddurations
GROUP BY subject_id
) d
ON icustay_detail.subject_id = d.subject_id
LEFT JOIN
( SELECT icustay_id, COUNT(DISTINCT solutionid)
AS medications
FROM medevents
GROUP BY icustay_id
) m
ON icustay_detail.icustay_id = m.icustay_id;
直到这里,代码工作正常,结果表如下
subject_id | subject_icustay_seq | icustay_id | icustay_admit_age | sex | los | bmi | additives | procedureevents | duration | medications
------------+---------------------+------------+-------------------+-----+----------------------+------------------+-----------+-----------------+----------+-------------
3 | 1 | 4 | 76.52892 | M | 6.06458333333333 | 30.18767669244 | 10 | 6 | 59145 | 2
12 | 1 | 13 | 72.36841 | M | 7.63472222222222 | | 3 | 10 | 15092 | 2
21 | 1 | 23 | 87.43771 | M | 5.90208333333333 | 21.3569005167836 | 7 | 15 | 0 | 1
21 | 2 | 24 | 87.82761 | M | 8.36527777777778 | | 7 | 15 | 0 | 1
26 | 1 | 29 | 72.00637 | M | 2.47569444444444 | | 0 | 2 | 0 | 0
31 | 1 | 34 | 72.26327 | M | 7.93819444444444 | 22.12581585657 | 6 | 6 | 18316 | 2
37 | 1 | 41 | 68.93134 | M | 1.13958333333333 | | 0 | 2 | 0 | 0
56 | 1 | 62 | 90.6427 | F | 1.84930555555556 | | 3 | 2 | 0 | 0
61 | 1 | 67 | 54.72569 | M | 2.55972222222222 | 18.168292503669 | 1 | 15 | 0 | 0
61 | 2 | 68 | 55.249 | M | 2.04791666666667 | | 1 | 15 | 0 | 1
61 | 3 | 69 | 55.26185 | M | 0.00277777777777778 | | 1 | 15 | 0 | 0
67 | 1 | 76 | 73.48396 | M | 0.214583333333333 | 23.8462015385569 | 2 | 1 | 160 | 1
78 | 1 | 89 | 48.62681 | M | 1.48958333333333 | | 1 | 1 | 2125 | 1
83 | 1 | 94 | 65.54958 | F | 2.30902777777778 | | 0 | 3 | 0 | 0
94 | 1 | 107 | 74.43468 | M | 1.09861111111111 | | 13 | 12 | 0 | 2
94 | 2 | 108 | 74.95603 | M | 22.8979166666667 | | 13 | 12 | 0 | 2
105 | 1 | 121 | 35.33793 | F | 4.99722222222222 | | 2 | 0 | 0 | 0
105 | 2 | 122 | 35.4088 | F | 1.61944444444444 | | 2 | 0 | 0 | 0
106 | 1 | 123 | 35.6095 | M | 6.1625 | 22.4312567927803 | 5 | 1 | 0 | 3
112 | 1 | 136 | 91.96166 | M | 0.840277777777778 | | 5 | 8 | 0 | 0
112 | 2 | 137 | 94.25 | M | 1.13888888888889 | | 5 | 8 | 0 | 2
问题是当我尝试从另一个表(labevents)添加另一列(异常)时,它正在添加这部分:
LEFT JOIN
( SELECT subject_id, round((COUNT(flag) * 100)::numeric / COUNT(*), 2) as abnormal
FROM labevents
GROUP BY subject_id
) ab
ON icustay_detail.subject_id = ab.subject_id
结果代码如下:
SELECT d_patients.subject_id, icustay_detail.subject_icustay_seq,
icustay_detail.icustay_id,
icustay_detail.icustay_admit_age, d_patients.sex,
icustay_detail.icustay_los/(60*24) los,
icustay_detail.weight_first/POWER(icustay_detail.height/100,2) bmi,
COALESCE( a.additives, 0 ) as additives,
COALESCE( p.procedureevents, 0 ) as procedureevents,
COALESCE( d.duration, 0 ) as duration,
COALESCE( ab.abnormal, 0 ) as abnormal,
COALESCE( m.medications,0 ) as medications
FROM icustay_detail
INNER JOIN d_patients
ON icustay_detail.subject_id = d_patients.subject_id
LEFT JOIN
( SELECT subject_id, count(*) as additives
FROM additives
GROUP BY subject_id
) a
ON icustay_detail.subject_id = a.subject_id
LEFT JOIN
( SELECT subject_id, count(*) as procedureevents
FROM procedureevents
GROUP BY subject_id
) p
ON icustay_detail.subject_id = p.subject_id
LEFT JOIN
( SELECT subject_id, SUM(duration) as duration
FROM a_meddurations
GROUP BY subject_id
) d
ON icustay_detail.subject_id = d.subject_id
LEFT JOIN
( SELECT subject_id, round((COUNT(flag) * 100)::numeric / COUNT(*), 2) as abnormal
FROM labevents
GROUP BY subject_id
) ab
ON icustay_detail.subject_id = ab.subject_id
LEFT JOIN
( SELECT icustay_id, COUNT(DISTINCT solutionid)
AS medications
FROM medevents
GROUP BY icustay_id
) m
ON icustay_detail.icustay_id = m.icustay_id;
发生的情况是“药物”列中值为 0 的患者行被删除,我不知道为什么。结果表是这个:
subject_id | subject_icustay_seq | icustay_id | icustay_admit_age | sex | los | bmi | additives | procedureevents | duration | abnormal | medications
------------+---------------------+------------+-------------------+-----+----------------------+------------------+-----------+-----------------+----------+----------+-------------
3 | 1 | 4 | 76.52892 | M | 6.06458333333333 | 30.18767669244 | 10 | 6 | 59145 | 23.15 | 2
12 | 1 | 13 | 72.36841 | M | 7.63472222222222 | | 3 | 10 | 15092 | 41.85 | 2
21 | 1 | 23 | 87.43771 | M | 5.90208333333333 | 21.3569005167836 | 7 | 15 | 0 | 43.74 | 1
21 | 2 | 24 | 87.82761 | M | 8.36527777777778 | | 7 | 15 | 0 | 43.74 | 1
31 | 1 | 34 | 72.26327 | M | 7.93819444444444 | 22.12581585657 | 6 | 6 | 18316 | 26.69 | 2
61 | 2 | 68 | 55.249 | M | 2.04791666666667 | | 1 | 15 | 0 | 41.63 | 1
67 | 1 | 76 | 73.48396 | M | 0.214583333333333 | 23.8462015385569 | 2 | 1 | 160 | 29.73 | 1
78 | 1 | 89 | 48.62681 | M | 1.48958333333333 | | 1 | 1 | 2125 | 35.75 | 1
94 | 1 | 107 | 74.43468 | M | 1.09861111111111 | | 13 | 12 | 0 | 34.10 | 2
94 | 2 | 108 | 74.95603 | M | 22.8979166666667 | | 13 | 12 | 0 | 34.10 | 2
106 | 1 | 123 | 35.6095 | M | 6.1625 | 22.4312567927803 | 5 | 1 | 0 | 37.67 | 3
112 | 2 | 137 | 94.25 | M | 1.13888888888889 | | 5 | 8 | 0 | 33.67 | 2
117 | 1 | 143 | 49.2842 | F | 1.32013888888889 | | 14 | 15 | 42785 | 47.76 | 1
117 | 2 | 144 | 49.89653 | F | 13.2493055555556 | 40.9070642360035 | 14 | 15 | 42785 | 47.76 | 3
124 | 1 | 151 | 69.63652 | M | 3.91111111111111 | 22.1382505356903 | 7 | 8 | 8560 | 24.24 | 1
124 | 2 | 152 | 71.07388 | M | 7.18958333333333 | | 7 | 8 | 8560 | 24.24 | 1
146 | 1 | 181 | 85.56421 | M | 7.57013888888889 | 34.7959879593229 | 11 | 9 | 32648 | 35.88 | 2
148 | 1 | 183 | 78.1597 | F | 15.2958333333333 | | 7 | 7 | 2826493 | 36.68 | 3
150 | 1 | 185 | 37.82158 | F | 2.26180555555556 | 14.2542487150986 | 1 | 4 | 3083 | 23.10 | 1
157 | 2 | 195 | 80.53706 | M | 4.11944444444444 | | 6 | 11 | 0 | 33.84 | 2
如果您需要表格的具体内容来尝试理解错误,请告诉我。非常感谢你。
最好的,
吉尔
解决方案
推荐阅读
- c# - 如何使用 IConfidentialClientApplication 获取安全 Web API 的令牌
- generative-adversarial-network - CycleGAN 中的 2 个鉴别器?
- android - 如何检测 NestedScrollView 何时到达底部?
- html - 尝试在两个浮动 div 下方添加一条水平线
- elasticsearch - ElasticSearch 5.6 Docker 设置密码
- css - 将现有 SCSS 文件传输到 CSS
- r - 在 dplyr 中使用 parse_number 和 mutate 替换原始变量
- django - 为什么 django 将空格填充到 char 字段的最大长度?
- r - 如何使用ggplot2绘制以相同前缀开头的多个变量之间的差异?
- ios - 让 Xcode 调试器在另一个文件中引发错误