How to select rows that have the same value in two columns and keep their characteristics

Problem description

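I have a complete data frame covering every city in Brazil, but I only want a set of predefined cities, which are listed in one of the columns. I want to keep all the columns of the data frame but select only the rows where the value in the all-cities column (AllCities) matches a value in the predefined-cities column (PredefinedCities).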

data = read.csv(file="C:/Users/guilherme/Desktop/data.csv", header=TRUE, sep=";")
data
  AllCities Year1990 Year200 PredefinedCities CharacCities1 CharacCities2
1         A        2       4                C            12             5
2         B        2       2                A            11            10
3         C        3       4                F            09             2
4         D        4       2
5         E        5       6
6         F        6       2

I want the following:

> data
  AllCities Year1990 Year200 PredefinedCities CharacCities1 CharacCities2
1         C        3       4                C            12             5
2         A        2       4                A            11            10
3         F        6       2                F            09             2

Tags: r, database, dataframe

Solution


You need merge:

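# Join the city-level columns to the predefined-city columns,
# matching AllCities against PredefinedCities
merge(
  data[, c("AllCities", "Year1990", "Year200")],
  data[, c("PredefinedCities", "CharacCities1", "CharacCities2")],
  by.x = "AllCities", by.y = "PredefinedCities"
)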

  AllCities Year1990 Year200 CharacCities1 CharacCities2
1         A        2       4            11            10
2         C        3       4            12             5
3         F        6       2             9             2
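
By default merge sorts the result by the key column, so the rows come back as A, C, F rather than in the C, A, F order shown in the question. If that order matters, one possible alternative is to look up each predefined city with match() and bind the two halves together. This is only a sketch, assuming every predefined city appears exactly once in AllCities and that blank entries are empty strings; the names predefined, idx, and result are illustrative, not part of the original answer.

```r
# Rebuild the example data (same values as the dput output in this answer)
data <- data.frame(
  AllCities        = c("A", "B", "C", "D", "E", "F"),
  Year1990         = c(2L, 2L, 3L, 4L, 5L, 6L),
  Year200          = c(4L, 2L, 4L, 2L, 6L, 2L),
  PredefinedCities = c("C", "A", "F", "", "", ""),
  CharacCities1    = c(12L, 11L, 9L, NA, NA, NA),
  CharacCities2    = c(5L, 10L, 2L, NA, NA, NA),
  stringsAsFactors = FALSE
)

# Keep only the rows that actually name a predefined city
predefined <- data[data$PredefinedCities != "",
                   c("PredefinedCities", "CharacCities1", "CharacCities2")]

# For each predefined city, find its row position in AllCities
idx <- match(predefined$PredefinedCities, data$AllCities)

# Put the matching city-level columns next to the predefined-city columns
result <- cbind(data[idx, c("AllCities", "Year1990", "Year200")], predefined)
rownames(result) <- NULL
result
```

A predefined city that is missing from AllCities would give NA in idx and therefore a row of NA values on the left-hand side, so it may be worth checking anyNA(idx) before binding.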

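Note: your data format is unusual. If you can, you should fix the data source so that you are given separate AllCities and PredefinedCities tables, or even so that they are joined properly before the csv file is created.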
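
To illustrate what the cleaner layout could look like, here is a minimal sketch with the two tables kept separate and sharing a key column. The object names all_cities and predefined_cities and the key column City are hypothetical, chosen only for this example; the values are the ones from the question.

```r
# One table with a row per city (hypothetical name: all_cities)
all_cities <- data.frame(
  City     = c("A", "B", "C", "D", "E", "F"),
  Year1990 = c(2L, 2L, 3L, 4L, 5L, 6L),
  Year200  = c(4L, 2L, 4L, 2L, 6L, 2L),
  stringsAsFactors = FALSE
)

# One table with a row per predefined city (hypothetical name: predefined_cities)
predefined_cities <- data.frame(
  City          = c("C", "A", "F"),
  CharacCities1 = c(12L, 11L, 9L),
  CharacCities2 = c(5L, 10L, 2L),
  stringsAsFactors = FALSE
)

# With a shared key column, a plain inner join keeps only the predefined cities
joined <- merge(all_cities, predefined_cities, by = "City")
joined
```

With this layout there are no blank or NA padding rows, and the join no longer needs by.x and by.y.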

Data:

structure(list(AllCities = c("A", "B", "C", "D", "E", "F"), Year1990 = c(2L, 
2L, 3L, 4L, 5L, 6L), Year200 = c(4L, 2L, 4L, 2L, 6L, 2L), PredefinedCities = c("C", 
"A", "F", "", "", ""), CharacCities1 = c(12L, 11L, 9L, NA, NA, 
NA), CharacCities2 = c(5L, 10L, 2L, NA, NA, NA)), .Names = c("AllCities", 
"Year1990", "Year200", "PredefinedCities", "CharacCities1", "CharacCities2"
), class = "data.frame", row.names = c(NA, -6L))
