How to get cumulative rainfall exceeding a percentile threshold in R for data with multiple columns

Problem description

I have the following data:

 df<- structure(list(year = c(1979L, 1979L, 1979L, 1979L, 1979L, 1979L, 
 1979L, 1979L, 1979L, 1979L, 1979L, 1979L, 1979L, 1979L, 1979L, 
 1979L, 1979L, 1979L, 1979L, 1979L, 1979L, 1979L, 1979L, 1979L, 
 1979L, 1979L, 1979L, 1979L, 1979L, 1979L, 1979L, 1979L, 1979L, 
 1979L, 1979L, 1979L, 1979L, 1979L, 1979L, 1979L, 1979L, 1979L, 
 1979L, 1979L, 1979L, 1979L, 1979L, 1979L, 1979L, 1979L, 1979L, 
 1979L, 1979L, 1979L, 1979L, 1979L, 1979L, 1979L, 1979L, 1979L, 
 1979L, 1979L, 1979L, 1979L, 1979L, 1979L, 1979L, 1979L, 1979L, 
 1979L, 1979L, 1979L, 1979L, 1979L, 1979L, 1979L, 1979L, 1979L, 
 1979L, 1979L, 1979L, 1979L, 1979L, 1979L, 1979L, 1979L, 1979L, 
 1979L, 1979L, 1979L, 1979L, 1979L, 1979L, 1979L, 1979L, 1979L, 
 1979L, 1979L, 1979L, 1979L, 1979L, 1979L, 1979L, 1979L, 1979L, 
 1979L, 1979L, 1979L, 1979L, 1979L, 1979L, 1979L, 1979L, 1979L, 
 1979L, 1979L, 1979L, 1979L, 1979L, 1979L, 1979L, 1979L), mon = c(6L, 
 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 
 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 7L, 7L, 7L, 
 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 
 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 8L, 8L, 8L, 8L, 
 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 
 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 9L, 9L, 9L, 9L, 9L, 
 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 
 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L), day = c(1L, 2L, 3L, 4L, 
 5L, 6L, 7L, 8L, 9L, 10L, 11L, 12L, 13L, 14L, 15L, 16L, 17L, 18L, 
 19L, 20L, 21L, 22L, 23L, 24L, 25L, 26L, 27L, 28L, 29L, 30L, 1L, 
 2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L, 11L, 12L, 13L, 14L, 15L, 
 16L, 17L, 18L, 19L, 20L, 21L, 22L, 23L, 24L, 25L, 26L, 27L, 28L, 
 29L, 30L, 31L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L, 11L, 
 12L, 13L, 14L, 15L, 16L, 17L, 18L, 19L, 20L, 21L, 22L, 23L, 24L, 
 25L, 26L, 27L, 28L, 29L, 30L, 31L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 
 8L, 9L, 10L, 11L, 12L, 13L, 14L, 15L, 16L, 17L, 18L, 19L, 20L, 
 21L, 22L, 23L, 24L, 25L, 26L, 27L, 28L, 29L, 30L), PCx = c(1.839, 
 1.754, 1.632, 1.476, 1.287, 1.07, 0.828, 0.566, 0.289, 0.001, 
 -0.29, -0.579, -0.858, -1.123, -1.367, -1.583, -1.766, -1.911, 
 -2.015, -2.075, -2.089, -2.056, -1.977, -1.855, -1.692, -1.494, 
 -1.265, -1.011, -0.74, -0.46, -0.176, 0.102, 0.37, 0.619, 0.844, 
 1.041, 1.206, 1.335, 1.427, 1.481, 1.497, 1.476, 1.42, 1.331, 
 1.211, 1.064, 0.893, 0.701, 0.493, 0.272, 0.043, -0.192, -0.428, 
 -0.66, -0.884, -1.096, -1.291, -1.465, -1.611, -1.727, -1.807, 
 -1.847, -1.844, -1.796, -1.702, -1.56, -1.373, -1.142, -0.874, 
 -0.572, -0.245, 0.099, 0.45, 0.799, 1.135, 1.447, 1.727, 1.964, 
 2.151, 2.283, 2.356, 2.368, 2.32, 2.215, 2.057, 1.854, 1.614, 
 1.347, 1.063, 0.774, 0.488, 0.217, -0.032, -0.251, -0.435, -0.581, 
 -0.689, -0.758, -0.793, -0.797, -0.777, -0.74, -0.693, -0.644, 
 -0.6, -0.568, -0.552, -0.556, -0.582, -0.63, -0.699, -0.784, 
 -0.882, -0.985, -1.086, -1.179, -1.256, -1.309, -1.332, -1.321, 
 -1.271, -1.18), PCy = c(0.696, 0.942, 1.173, 1.384, 1.571, 1.729, 
 1.853, 1.941, 1.99, 1.998, 1.964, 1.886, 1.767, 1.608, 1.411, 
 1.179, 0.918, 0.633, 0.33, 0.016, -0.303, -0.618, -0.922, -1.208, 
 -1.469, -1.698, -1.89, -2.039, -2.143, -2.199, -2.205, -2.162, 
 -2.071, -1.935, -1.756, -1.54, -1.292, -1.018, -0.725, -0.419, 
 -0.107, 0.204, 0.507, 0.795, 1.063, 1.304, 1.514, 1.688, 1.823, 
 1.914, 1.961, 1.961, 1.913, 1.818, 1.677, 1.491, 1.263, 0.998, 
 0.7, 0.375, 0.029, -0.33, -0.694, -1.054, -1.401, -1.727, -2.022, 
 -2.279, -2.489, -2.646, -2.744, -2.782, -2.755, -2.665, -2.513, 
 -2.303, -2.041, -1.735, -1.392, -1.022, -0.637, -0.246, 0.14, 
 0.51, 0.855, 1.168, 1.441, 1.671, 1.855, 1.991, 2.081, 2.128, 
 2.135, 2.109, 2.055, 1.98, 1.891, 1.795, 1.697, 1.602, 1.515, 
 1.437, 1.369, 1.31, 1.26, 1.214, 1.169, 1.12, 1.063, 0.993, 0.905, 
 0.797, 0.664, 0.506, 0.323, 0.117, -0.11, -0.353, -0.606, -0.864, 
 -1.118, -1.362), phase = c(1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 
 2L, 3L, 3L, 3L, 3L, 3L, 4L, 4L, 4L, 4L, 4L, 5L, 5L, 5L, 5L, 5L, 
 6L, 6L, 6L, 6L, 6L, 6L, 7L, 7L, 7L, 7L, 7L, 7L, 8L, 8L, 8L, 8L, 
 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L, 4L, 
 4L, 4L, 4L, 4L, 5L, 5L, 5L, 5L, 6L, 6L, 6L, 6L, 6L, 6L, 7L, 7L, 
 7L, 7L, 7L, 7L, 8L, 8L, 8L, 8L, 8L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 
 2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 
 3L, 3L, 3L, 3L, 3L, 3L, 3L, 4L, 4L, 4L, 4L, 5L, 5L, 5L, 5L, 5L, 
 6L), Amp.nrm. = c(1.966, 1.991, 2.01, 2.023, 2.031, 2.033, 2.03, 
 2.022, 2.011, 1.998, 1.985, 1.973, 1.965, 1.961, 1.964, 1.974, 
 1.99, 2.014, 2.042, 2.075, 2.111, 2.147, 2.182, 2.214, 2.241, 
 2.261, 2.274, 2.276, 2.267, 2.246, 2.212, 2.164, 2.104, 2.031, 
 1.949, 1.859, 1.767, 1.679, 1.601, 1.539, 1.501, 1.49, 1.508, 
 1.55, 1.611, 1.683, 1.758, 1.828, 1.888, 1.934, 1.961, 1.97, 
 1.96, 1.934, 1.895, 1.85, 1.807, 1.772, 1.757, 1.767, 1.807, 
 1.876, 1.971, 2.083, 2.204, 2.327, 2.444, 2.549, 2.637, 2.707, 
 2.755, 2.783, 2.792, 2.782, 2.757, 2.72, 2.674, 2.62, 2.562, 
 2.501, 2.441, 2.381, 2.324, 2.273, 2.228, 2.191, 2.164, 2.147, 
 2.138, 2.136, 2.138, 2.139, 2.135, 2.123, 2.1, 2.063, 2.012, 
 1.948, 1.873, 1.79, 1.702, 1.616, 1.534, 1.46, 1.395, 1.34, 1.293, 
 1.251, 1.212, 1.176, 1.144, 1.118, 1.104, 1.107, 1.134, 1.185, 
 1.261, 1.356, 1.464, 1.578, 1.693, 1.803), LAOAG.CITY = c(12.7, 
 0, 0, 0, 0, 0, 0, 0, 0, 1.3, 0, 0, 0, 0, 0, 0, 0, 0, 0, 2.3, 
 15.4, 1, 4.6, 0, 0, 15.2, 1.8, 1.3, 39.4, 0, 0, 0, 104.7, 30.5, 
 0, 0, 0, 23.7, 0, 0, 6.8, 0, 0, 0, 5.7, 0, 0, 0, 0, 33, 0, 0, 
 0, 3.3, 6.4, 1.5, 183.4, 42.9, 32.4, 15.5, 35.8, 33.1, 11.6, 
 96.8, 18.3, 1.3, 10.2, 17.3, 4.1, 35.1, 0, 32.1, 13.4, 8.9, 13.7, 
 132.9, 35.9, 0, 0, 0, 0, 2.8, 0.4, 1.3, 4.8, 0, 0, 0, 0, 9.4, 
 0, 0, 0, 0.5, 0, 0, 0, 0, 0, 0, 2.8, 0, 0, 0, 0, 0, 18.3, 7.8, 
 53.3, 10, 0, 19.3, 0, 0, 0, 0, 0, 2.4, 0, 0, 0, 0), APARRI = c(7, 
 0, 1, 1, 0, 0, 2, 0.5, 0, 0, 0.5, 0, 0, 0, 0, 0, 4.5, 0, 0, 0, 
 0, 0, 0, 0, 1.5, 0, 0, 0, 0, 0, 0, 17, 57.5, 2.5, 7.5, 13, 0, 
 0, 2, 0, 2.5, 0, 0, 0, 0, 15, 0, 0, 0, 7.5, 0, 0, 0, 7.5, 0, 
 1, 10.5, 2, 3.5, 29, 73, 17, 1, 1.5, 1.5, 7.5, 1, 2, 4.5, 0, 
 0, 11, 11, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
 0, 0, 4, 7.5, 0.5, 16, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0.5, 2, 
 1, 0, 0, 0, 0, 0, 0, 0, 0, 40, 1.5, 1, 1.5, 0)), row.names = 72:193, class = "data.frame")

The important columns in this data are: year, mon, day, phase, LAOAG.CITY, and APARRI. The data has 8 phases.

I want to:

(A) get the sum of rainfall of ALL DAYS with rainfall exceeding the 95th percentile for the last two columns (call this Y)

(B) get the sum of rainfall of days exceeding the 95th percentile for each phase only (call this Xphase#)

Then I want to compute something like the following and apply it to every phase:

pct_change = ((Xphase1-Y)/Y)*100 

My expected output is a table like this, holding the pct_change for the two locations:

phase LAOAG.CITY APARRI
1
2
3
4
5
6
7
8

What I have so far:

I calculated it by hand, which is a bit tedious. Here is my script:

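 lapply(df[8:9], quantile, probs = 0.95, names = FALSE)
 # Sum the rainfall on days above the threshold (not the number of such days)
 Y <- sum(df$LAOAG.CITY[df$LAOAG.CITY > 39.225])
 Xphase1 <- sum(df$LAOAG.CITY[df$phase == 1 & df$LAOAG.CITY > 39.225])
 Xphase2 <- sum(df$LAOAG.CITY[df$phase == 2 & df$LAOAG.CITY > 39.225])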
 ....
 ....

The computed percentiles for the two columns are:

$LAOAG.CITY
[1] 39.225

$APARRI
[1] 15.95


   phase LAOAG APARRI  LAOAG_PCT
       1  12.7    7.0 -0.9786375
       2  33.0   15.0 -0.9444912
       3  72.6   16.0 -0.8778806
       4 183.4   73.0 -0.6915055
       5  96.8   40.0 -0.8371741
       6  39.4    7.5 -0.9337258
       7 132.9   57.5 -0.7764508
       8  23.7    2.5 -0.9601346

I will be applying this to a dataset with 30 columns.

Is there a more efficient way to do this in R?

I would appreciate any help.

Tags: r, csv

Solution


Create the data frame:

df <- data.frame(list(year = c(1979L, 1979L, 1979L, 1979L, 1979L, 1979L, 
 1979L, 1979L, 1979L, 1979L, 1979L, 1979L, 1979L, 1979L, 1979L, 
 1979L, 1979L, 1979L, 1979L, 1979L, 1979L, 1979L, 1979L, 1979L, 
 1979L, 1979L, 1979L, 1979L, 1979L, 1979L, 1979L, 1979L, 1979L, 
 1979L, 1979L, 1979L, 1979L, 1979L, 1979L, 1979L, 1979L, 1979L, 
 1979L, 1979L, 1979L, 1979L, 1979L, 1979L, 1979L, 1979L, 1979L, 
 1979L, 1979L, 1979L, 1979L, 1979L, 1979L, 1979L, 1979L, 1979L, 
 1979L, 1979L, 1979L, 1979L, 1979L, 1979L, 1979L, 1979L, 1979L, 
 1979L, 1979L, 1979L, 1979L, 1979L, 1979L, 1979L, 1979L, 1979L, 
 1979L, 1979L, 1979L, 1979L, 1979L, 1979L, 1979L, 1979L, 1979L, 
 1979L, 1979L, 1979L, 1979L, 1979L, 1979L, 1979L, 1979L, 1979L, 
 1979L, 1979L, 1979L, 1979L, 1979L, 1979L, 1979L, 1979L, 1979L, 
 1979L, 1979L, 1979L, 1979L, 1979L, 1979L, 1979L, 1979L, 1979L, 
 1979L, 1979L, 1979L, 1979L, 1979L, 1979L, 1979L, 1979L), mon = c(6L, 
 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 
 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 7L, 7L, 7L, 
 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 
 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 7L, 8L, 8L, 8L, 8L, 
 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 
 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 9L, 9L, 9L, 9L, 9L, 
 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 
 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L, 9L), day = c(1L, 2L, 3L, 4L, 
 5L, 6L, 7L, 8L, 9L, 10L, 11L, 12L, 13L, 14L, 15L, 16L, 17L, 18L, 
 19L, 20L, 21L, 22L, 23L, 24L, 25L, 26L, 27L, 28L, 29L, 30L, 1L, 
 2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L, 11L, 12L, 13L, 14L, 15L, 
 16L, 17L, 18L, 19L, 20L, 21L, 22L, 23L, 24L, 25L, 26L, 27L, 28L, 
 29L, 30L, 31L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L, 11L, 
 12L, 13L, 14L, 15L, 16L, 17L, 18L, 19L, 20L, 21L, 22L, 23L, 24L, 
 25L, 26L, 27L, 28L, 29L, 30L, 31L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 
 8L, 9L, 10L, 11L, 12L, 13L, 14L, 15L, 16L, 17L, 18L, 19L, 20L, 
 21L, 22L, 23L, 24L, 25L, 26L, 27L, 28L, 29L, 30L), PCx = c(1.839, 
 1.754, 1.632, 1.476, 1.287, 1.07, 0.828, 0.566, 0.289, 0.001, 
 -0.29, -0.579, -0.858, -1.123, -1.367, -1.583, -1.766, -1.911, 
 -2.015, -2.075, -2.089, -2.056, -1.977, -1.855, -1.692, -1.494, 
 -1.265, -1.011, -0.74, -0.46, -0.176, 0.102, 0.37, 0.619, 0.844, 
 1.041, 1.206, 1.335, 1.427, 1.481, 1.497, 1.476, 1.42, 1.331, 
 1.211, 1.064, 0.893, 0.701, 0.493, 0.272, 0.043, -0.192, -0.428, 
 -0.66, -0.884, -1.096, -1.291, -1.465, -1.611, -1.727, -1.807, 
 -1.847, -1.844, -1.796, -1.702, -1.56, -1.373, -1.142, -0.874, 
 -0.572, -0.245, 0.099, 0.45, 0.799, 1.135, 1.447, 1.727, 1.964, 
 2.151, 2.283, 2.356, 2.368, 2.32, 2.215, 2.057, 1.854, 1.614, 
 1.347, 1.063, 0.774, 0.488, 0.217, -0.032, -0.251, -0.435, -0.581, 
 -0.689, -0.758, -0.793, -0.797, -0.777, -0.74, -0.693, -0.644, 
 -0.6, -0.568, -0.552, -0.556, -0.582, -0.63, -0.699, -0.784, 
 -0.882, -0.985, -1.086, -1.179, -1.256, -1.309, -1.332, -1.321, 
 -1.271, -1.18), PCy = c(0.696, 0.942, 1.173, 1.384, 1.571, 1.729, 
 1.853, 1.941, 1.99, 1.998, 1.964, 1.886, 1.767, 1.608, 1.411, 
 1.179, 0.918, 0.633, 0.33, 0.016, -0.303, -0.618, -0.922, -1.208, 
 -1.469, -1.698, -1.89, -2.039, -2.143, -2.199, -2.205, -2.162, 
 -2.071, -1.935, -1.756, -1.54, -1.292, -1.018, -0.725, -0.419, 
 -0.107, 0.204, 0.507, 0.795, 1.063, 1.304, 1.514, 1.688, 1.823, 
 1.914, 1.961, 1.961, 1.913, 1.818, 1.677, 1.491, 1.263, 0.998, 
 0.7, 0.375, 0.029, -0.33, -0.694, -1.054, -1.401, -1.727, -2.022, 
 -2.279, -2.489, -2.646, -2.744, -2.782, -2.755, -2.665, -2.513, 
 -2.303, -2.041, -1.735, -1.392, -1.022, -0.637, -0.246, 0.14, 
 0.51, 0.855, 1.168, 1.441, 1.671, 1.855, 1.991, 2.081, 2.128, 
 2.135, 2.109, 2.055, 1.98, 1.891, 1.795, 1.697, 1.602, 1.515, 
 1.437, 1.369, 1.31, 1.26, 1.214, 1.169, 1.12, 1.063, 0.993, 0.905, 
 0.797, 0.664, 0.506, 0.323, 0.117, -0.11, -0.353, -0.606, -0.864, 
 -1.118, -1.362), phase = c(1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 
 2L, 3L, 3L, 3L, 3L, 3L, 4L, 4L, 4L, 4L, 4L, 5L, 5L, 5L, 5L, 5L, 
 6L, 6L, 6L, 6L, 6L, 6L, 7L, 7L, 7L, 7L, 7L, 7L, 8L, 8L, 8L, 8L, 
 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L, 4L, 
 4L, 4L, 4L, 4L, 5L, 5L, 5L, 5L, 6L, 6L, 6L, 6L, 6L, 6L, 7L, 7L, 
 7L, 7L, 7L, 7L, 8L, 8L, 8L, 8L, 8L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 
 2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 
 3L, 3L, 3L, 3L, 3L, 3L, 3L, 4L, 4L, 4L, 4L, 5L, 5L, 5L, 5L, 5L, 
 6L), Amp.nrm. = c(1.966, 1.991, 2.01, 2.023, 2.031, 2.033, 2.03, 
 2.022, 2.011, 1.998, 1.985, 1.973, 1.965, 1.961, 1.964, 1.974, 
 1.99, 2.014, 2.042, 2.075, 2.111, 2.147, 2.182, 2.214, 2.241, 
 2.261, 2.274, 2.276, 2.267, 2.246, 2.212, 2.164, 2.104, 2.031, 
 1.949, 1.859, 1.767, 1.679, 1.601, 1.539, 1.501, 1.49, 1.508, 
 1.55, 1.611, 1.683, 1.758, 1.828, 1.888, 1.934, 1.961, 1.97, 
 1.96, 1.934, 1.895, 1.85, 1.807, 1.772, 1.757, 1.767, 1.807, 
 1.876, 1.971, 2.083, 2.204, 2.327, 2.444, 2.549, 2.637, 2.707, 
 2.755, 2.783, 2.792, 2.782, 2.757, 2.72, 2.674, 2.62, 2.562, 
 2.501, 2.441, 2.381, 2.324, 2.273, 2.228, 2.191, 2.164, 2.147, 
 2.138, 2.136, 2.138, 2.139, 2.135, 2.123, 2.1, 2.063, 2.012, 
 1.948, 1.873, 1.79, 1.702, 1.616, 1.534, 1.46, 1.395, 1.34, 1.293, 
 1.251, 1.212, 1.176, 1.144, 1.118, 1.104, 1.107, 1.134, 1.185, 
 1.261, 1.356, 1.464, 1.578, 1.693, 1.803), LAOAG.CITY = c(12.7, 
 0, 0, 0, 0, 0, 0, 0, 0, 1.3, 0, 0, 0, 0, 0, 0, 0, 0, 0, 2.3, 
 15.4, 1, 4.6, 0, 0, 15.2, 1.8, 1.3, 39.4, 0, 0, 0, 104.7, 30.5, 
 0, 0, 0, 23.7, 0, 0, 6.8, 0, 0, 0, 5.7, 0, 0, 0, 0, 33, 0, 0, 
 0, 3.3, 6.4, 1.5, 183.4, 42.9, 32.4, 15.5, 35.8, 33.1, 11.6, 
 96.8, 18.3, 1.3, 10.2, 17.3, 4.1, 35.1, 0, 32.1, 13.4, 8.9, 13.7, 
 132.9, 35.9, 0, 0, 0, 0, 2.8, 0.4, 1.3, 4.8, 0, 0, 0, 0, 9.4, 
 0, 0, 0, 0.5, 0, 0, 0, 0, 0, 0, 2.8, 0, 0, 0, 0, 0, 18.3, 7.8, 
 53.3, 10, 0, 19.3, 0, 0, 0, 0, 0, 2.4, 0, 0, 0, 0), APARRI = c(7, 
 0, 1, 1, 0, 0, 2, 0.5, 0, 0, 0.5, 0, 0, 0, 0, 0, 4.5, 0, 0, 0, 
 0, 0, 0, 0, 1.5, 0, 0, 0, 0, 0, 0, 17, 57.5, 2.5, 7.5, 13, 0, 
 0, 2, 0, 2.5, 0, 0, 0, 0, 15, 0, 0, 0, 7.5, 0, 0, 0, 7.5, 0, 
 1, 10.5, 2, 3.5, 29, 73, 17, 1, 1.5, 1.5, 7.5, 1, 2, 4.5, 0, 
 0, 11, 11, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
 0, 0, 4, 7.5, 0.5, 16, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0.5, 2, 
 1, 0, 0, 0, 0, 0, 0, 0, 0, 40, 1.5, 1, 1.5, 0)))

head(df)

Output:

  year mon day   PCx   PCy phase Amp.nrm. LAOAG.CITY APARRI
1 1979   6   1 1.839 0.696     1    1.966       12.7      7
2 1979   6   2 1.754 0.942     1    1.991        0.0      0
3 1979   6   3 1.632 1.173     1    2.010        0.0      1
4 1979   6   4 1.476 1.384     1    2.023        0.0      1
5 1979   6   5 1.287 1.571     2    2.031        0.0      0
6 1979   6   6 1.070 1.729     2    2.033        0.0      0

Question (A):

# 95th percentile of each station column, computed over all days
percentiles_95 <- lapply(df[c("LAOAG.CITY", "APARRI")], quantile, probs = 0.95, names = FALSE)
# Y: total rainfall on the days that exceed each column's 95th percentile
sum(df$LAOAG.CITY[df$LAOAG.CITY > percentiles_95$LAOAG.CITY])
sum(df$APARRI[df$APARRI > percentiles_95$APARRI])
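With 30 station columns, writing one `sum(...)` line per column gets repetitive. One possible way to cover every column at once is to loop over the column names with `sapply`. The sketch below is only an illustration on a small made-up data frame: `toy`, `ST1`, `ST2` and `stations` are hypothetical names, not part of the data above.

```r
# Toy data: 10 days, two hypothetical station columns
toy <- data.frame(
  phase = rep(1:2, each = 5),
  ST1   = c(0, 1, 2, 3, 4, 5, 6, 7, 8, 100),
  ST2   = c(0, 0, 0, 0, 50, 0, 0, 0, 0, 10)
)
stations <- c("ST1", "ST2")   # in real data: the names of all station columns

# 95th percentile of each station column (named numeric vector)
p95 <- sapply(toy[stations], quantile, probs = 0.95, names = FALSE)

# Y for each station: rainfall summed over days strictly above its threshold
Y <- sapply(stations, function(s) sum(toy[[s]][toy[[s]] > p95[[s]]]))
Y
```

On the real data, `stations` could be set to something like `names(df)[8:9]`, assuming the station columns sit in those positions.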

Question (B):

phases <- unique(df["phase"])   # one-column data frame of the distinct phases
LAOAG <- c()
APARRI <- c()

for (phase_now in c(1:max(phases))){
  df_sub <- subset(df, phase == phase_now)   # days that belong to this phase

  # 95th percentile within this phase only
  percentiles_95_sub <- lapply(df_sub[c("LAOAG.CITY", "APARRI")], quantile, probs = 0.95, names = FALSE)
  # Sum of rainfall on the days of this phase that exceed that percentile
  LAOAG <- c(LAOAG, sum(subset(df_sub, df_sub["LAOAG.CITY"] > percentiles_95_sub$LAOAG.CITY)["LAOAG.CITY"]))
  APARRI <- c(APARRI, sum(subset(df_sub, df_sub["APARRI"] > percentiles_95_sub$APARRI)["APARRI"]))
}

data.frame(phases, LAOAG, APARRI)

Result:

   phase LAOAG APARRI
1      1  12.7    7.0
5      2  33.0   15.0
11     3  72.6   16.0
16     4 183.4   73.0
21     5  96.8   40.0
26     6  39.4    7.5
32     7 132.9   57.5
38     8  23.7    2.5
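The per-phase loop can probably be written more compactly with `aggregate`, which applies one function to every station column within each phase, so it scales to many columns without one accumulator vector per station. This is a sketch on made-up data, not the original author's code: `toy`, `ST1`, `ST2`, `stations` and `sum_above_p95` are hypothetical names.

```r
# Toy data: 22 days, two phases, two hypothetical station columns
toy <- data.frame(
  phase = rep(1:2, each = 11),
  ST1   = c(30, rep(0, 10), 60, 5, 10, rep(0, 8)),
  ST2   = c(rep(0, 9), 20, 80, 40, rep(0, 10))
)
stations <- c("ST1", "ST2")

# Sum of the values that lie strictly above their own 95th percentile
sum_above_p95 <- function(x) {
  thr <- quantile(x, probs = 0.95, names = FALSE)
  sum(x[x > thr])
}

# Apply the helper to every station column, separately within each phase
per_phase <- aggregate(toy[stations], by = list(phase = toy$phase),
                       FUN = sum_above_p95)
per_phase
```

As in the loop, the threshold here is recomputed inside each phase, so a phase with few days will typically contribute only its largest value.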

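For question B, I compute the 95th percentile within each phase and then sum the observations in that phase that exceed it. If you would rather compare each phase against the 95th percentile taken over all phases, use percentiles_95 from question A instead of percentiles_95_sub.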
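For the variant that uses one threshold over all phases, the pct_change table from the question can be sketched as below. This is one possible approach under stated assumptions, shown on made-up data: `toy`, `ST1`, `ST2`, `stations`, `pct_by_phase` and `result` are hypothetical names, and days at or below the threshold are simply counted as zero.

```r
# Toy data: 22 days, two phases, two hypothetical station columns
toy <- data.frame(
  phase = rep(1:2, each = 11),
  ST1   = c(30, rep(0, 10), 60, 5, 10, rep(0, 8)),
  ST2   = c(rep(0, 9), 20, 80, 40, rep(0, 10))
)
stations <- c("ST1", "ST2")

pct_by_phase <- sapply(stations, function(s) {
  x   <- toy[[s]]
  thr <- quantile(x, probs = 0.95, names = FALSE)  # threshold over all days
  wet <- ifelse(x > thr, x, 0)                     # keep exceedance days only
  Y   <- sum(wet)                                  # total over all phases
  X   <- sapply(split(wet, toy$phase), sum)        # total within each phase
  (X - Y) / Y * 100                                # pct_change per phase
})

# One row per phase, one column per station
result <- data.frame(phase = as.integer(rownames(pct_by_phase)),
                     pct_by_phase, row.names = NULL)
result
```

Note that a station with no day above its threshold would give Y = 0 and therefore NaN in this sketch; real data may need a guard for that case.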

