Problem merging two datasets in pandas/GeoPandas

Problem description

import pandas as pd

# Small test dataset: two people and the place each one belongs to
d = {'col1': ['Jan', 'Willem'], 'name': ['Moddergat', 'Winthagen']}
data = pd.DataFrame(data=d)
data

Hey guys, the code above is a test I'm running to merge two datasets. The dataset I'm trying to merge it with looks like this:

      osm_id    name             type     population  geometry
5574  48291277  Zwaagwesteinde   village  1000        POINT (6.03895 53.25643)
1333  42895259  Poppendamme      village  0           POINT (3.55466 51.52072)
142   41994373  Winthagen        village  0           POINT (5.93158 50.86299)
3612  46201554  De Glip          hamlet   0           POINT (4.61127 52.33054)
659   42427709  Lange Hout       hamlet   0           POINT (6.03483 51.34534)
1044  42685042  Venweg           hamlet   0           POINT (4.94120 51.45961)
4138  47132813  Zuidermeer       village  1000        POINT (4.97614 52.66399)
5912  48470661  Moddergat        village  1000        POINT (6.07969 53.40367)
5047  47872376  Sibrandabuorren  village  1000        POINT (5.72101 53.06785)
4979  47811814  Idsegahuizum     village  1000        POINT (5.41902 53.04249)

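Somehow, when I turn the first dataset into one with a counter, it works, but when I merge that same counter into the second dataset, the count comes out as all zeros. Does anyone know why the rows don't match?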

#this outputs a correct counter
data['count'] = 1
dataByNeighbourhood = data.groupby('name').count()[['count']].reset_index()
dataByNeighbourhood['name'] = dataByNeighbourhood['name'].str.lower()
dataByNeighbourhood.sort_values('count', ascending=False).head(10)

#this outputs a counter with all zeros
merged = regions.set_index('name').join(dataByNeighbourhood.set_index('name'))
merged = merged.reset_index()
merged = merged.fillna(0)
merged[['name', 'type', 'population', 'geometry', 'count']].sample(5)
print(merged['count'].max())  # call max(); without the parentheses this prints the bound method

Thanks a lot for your help :) PS: sorry about the odd-looking dataset, I don't know how to style it here.

Tags: python, pandas, geopandas

Solution
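
A likely cause, judging only from the code in the question: the counter's `name` column is lower-cased with `.str.lower()`, while the names in `regions` keep their original capitalisation ('Moddergat' vs. 'moddergat'). A join on `name` then finds no matching keys, every `count` is NaN, and `fillna(0)` turns them all into zeros. The sketch below reproduces this and shows one possible fix. It rests on assumptions: `regions` here is a small plain DataFrame rebuilt from three of the sample rows (the real one is presumably a GeoDataFrame with a geometry column), and `name_key` is a hypothetical helper column, not something from the original code.

```python
import pandas as pd

# Test data from the question.
data = pd.DataFrame({'col1': ['Jan', 'Willem'],
                     'name': ['Moddergat', 'Winthagen']})

# Stand-in for `regions` (assumption: rebuilt from three sample rows,
# geometry omitted so the example needs only pandas).
regions = pd.DataFrame({
    'osm_id': [41994373, 48470661, 47132813],
    'name': ['Winthagen', 'Moddergat', 'Zuidermeer'],
    'type': ['village', 'village', 'village'],
    'population': [0, 1000, 1000],
})

# Count rows per name, then lower-case the key as the question does.
counts = data.groupby('name').size().reset_index(name='count')
counts['name'] = counts['name'].str.lower()

# Reproduce the problem: 'Moddergat' != 'moddergat', so nothing matches.
broken = regions.merge(counts, on='name', how='left')
print(broken['count'].isna().all())

# Possible fix: build the same normalised key on both sides and merge on it.
# `name_key` is a hypothetical column name chosen for this sketch.
regions['name_key'] = regions['name'].str.strip().str.lower()
counts = counts.rename(columns={'name': 'name_key'})
merged = regions.merge(counts, on='name_key', how='left', indicator=True)
merged['count'] = merged['count'].fillna(0).astype(int)

# The `_merge` column shows which rows found a partner ('both')
# and which did not ('left_only').
print(merged[['name', 'count', '_merge']])
print(merged['count'].max())
```

Keeping the original `name` column and matching on a separate normalised key leaves the display names untouched. If the real data also differs in whitespace or spelling, the `_merge` column is a quick way to see which rows still fail to match.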

