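r - Subsetting rows in a data frame based on rows in another data frame in R
Problem description
Just a snapshot of the data we are working with
What I want to do is identify the blocks (BlockId) in which Class 5 makes up more than 90%, and then remove all of those blocks from the dataset. I started by subsetting the data: subset(NLCD2008, Class==5 & Percent > .90)
That gave me a data frame with a column containing the blocks that should be removed, like this: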
> dput(ids)
structure(list(BLOCKID = c(100, 131, 179, 200, 222, 236, 238,
241, 244, 254, 257, 258, 265, 266, 27, 278, 57, 63, 69, 75, 81
), Class = c("5", "5", "5", "5", "5", "5", "5", "5", "5", "5",
"5", "5", "5", "5", "5", "5", "5", "5", "5", "5", "5"), CA = c(22983987.0806,
24692082.1724, 23533460.3724, 23401233.5635, 24116398.1926, 23766711.1699,
24795140.5362, 24876914.4067, 24898552.2795, 24985030.0734, 25012822.6465,
24993341.0278, 25041230.4987, 25049166.7966, 22372955.0846, 24737206.1697,
24104160.9584, 24922870.2331, 24943920.0281, 24162534.823, 23096329.0313
), TLA = c(25018769.0617, 25057087.1604, 25149935.9177, 25176830.9298,
25207224.138, 24802986.7321, 24852905.0566, 24883383.5601, 24898641.1381,
24985030.0734, 25012822.6465, 25049866.3254, 25090169.5911, 25072609.4832,
24830593.7725, 25144460.7117, 24935516.21, 24930068.7064, 24947519.2647,
24961803.5077, 24974601.3436), MSI = c(1.69665962298056, 1.31048429936865,
1.33110171648693, 1.36242160001161, 1.27666751812728, 1.22789953816493,
1.26867391259833, 1.25128851571841, 1.18533526393745, 1.18792224187668,
1.18520978795299, 1.39406482047182, 1.24884906769663, 1.24939571303602,
1.31731564029142, 1.59900472213938, 1.38890295951441, 1.20315890311899,
1.18325402703837, 1.27998393051198, 1.47485350719615), Percent = c(0.918669780432366,
0.985433063880751, 0.935726454707888, 0.929474945784445, 0.956725661682217,
0.958219726785611, 0.997675743730222, 0.99974002115169, 0.999996431186766,
1, 1, 0.997743489052367, 0.998049471438513, 0.999065008107126,
0.901023764859709, 0.983803409161585, 0.96665979382185, 0.999711253370988,
0.999855727675293, 0.967980331050461, 0.92479270093409)), row.names = c(NA,
-21L), class = c("tbl_df", "tbl", "data.frame"))
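What I want to do from here is take the 21 unique block IDs from this subset and remove them from the original data. So this subset identifies blocks 27, 57, 63, ... as unsuitable, and I would like to take that list and remove those blocks from the original data.
Solution
You can try this: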
NLCD2008[ !(with(NLCD2008, Class==5 & Percent > .90)), ]
Using subset():
# drop the rows where class 5 makes up more than 90% of a block in the NLCD2008 dataset.
subset(NLCD2008, !(Class==5 & Percent > .90))
# get filtered block ids
ids <- subset(NLCD2008, Class == 5 & Percent > 0.9)
# remove the block ids from original data.
NLCD2008[!(NLCD2008$BLOCKID %in% unique(ids$BLOCKID)), ]
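The two approaches above are not equivalent when a block has rows for several classes: negating the condition drops only the matching rows, while the %in% lookup drops every row of the flagged blocks. A minimal sketch of the difference, assuming a small made-up data frame `toy` that stands in for NLCD2008 (the values are hypothetical, not taken from the real data):

```r
# Toy stand-in for NLCD2008 (hypothetical values, not the real data).
toy <- data.frame(
  BLOCKID = c(27, 27, 57, 57, 90, 90),
  Class   = c("5", "2", "5", "3", "5", "2"),
  Percent = c(0.95, 0.05, 0.97, 0.03, 0.40, 0.60)
)

# IDs of blocks in which class 5 makes up more than 90%.
bad_ids <- unique(toy$BLOCKID[toy$Class == 5 & toy$Percent > 0.9])

# Row-level filter: removes only the rows that match the condition,
# so the other rows of blocks 27 and 57 are still present.
row_filtered <- subset(toy, !(Class == 5 & Percent > 0.9))

# Block-level filter: removes every row that belongs to a flagged block.
block_filtered <- toy[!(toy$BLOCKID %in% bad_ids), ]
```

Here `row_filtered` still contains the class 2 and class 3 rows of blocks 27 and 57, whereas `block_filtered` keeps only block 90. If the goal is to remove whole blocks, as the question describes, the %in% version is probably the one to use.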