GeoPandas GeoDataFrame transformation

Question

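Dear stackoverflow community,

Over the past few weeks I have read documentation and articles about Python, Pandas and GeoPandas. Unfortunately, programming is still not as intuitive to me as I would like, and since I do not come from a programming background, working with GeoPandas has been a pure nightmare.

I have a rather complex (at least for me) geopandas.GeoDataFrame that I need to transform for further regression analysis. Sadly, even after countless searches on stackoverflow and many other websites, I have not been able to transform my data in a suitable way.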


My GeoDataFrame looks like this:

         INCIDENTDATE       CATEGORY_left    CATEGORY_right  \
POLYGON                                                       
1                2009            BURGLARY        restaurant   
1                2009            HOMICIDE        restaurant   
1                2010             ASSAULT        restaurant   
1                2011             ASSAULT        restaurant   
1                2012             LARCENY        restaurant   
1                2012  AGGRAVATED ASSAULT        restaurant   
1                2012            BURGLARY        restaurant   
1                2012  DAMAGE TO PROPERTY        restaurant   
1                2013  AGGRAVATED ASSAULT        restaurant   
1                2014            BURGLARY        restaurant   
3                2010  MURDER/INFORMATION          crossing   
3                2011  AGGRAVATED ASSAULT          crossing   
3                2011            BURGLARY          crossing   
3                2011             ASSAULT          crossing   
3                2012  AGGRAVATED ASSAULT          crossing   
3                2012  MURDER/INFORMATION          crossing   
3                2013     DANGEROUS DRUGS          crossing   
3                2014  DAMAGE TO PROPERTY          crossing   
3                2015             ASSAULT          crossing   

                                                  geometry    shape_area  
POLYGON                                                                   
1        POLYGON ((-83.13630642653472 42.43895550416347...  3.959841e+06  
1        POLYGON ((-83.13630642653472 42.43895550416347...  3.959841e+06  
1        POLYGON ((-83.13630642653472 42.43895550416347...  3.959841e+06  
1        POLYGON ((-83.13630642653472 42.43895550416347...  3.959841e+06  
1        POLYGON ((-83.13630642653472 42.43895550416347...  3.959841e+06  
1        POLYGON ((-83.13630642653472 42.43895550416347...  3.959841e+06  
1        POLYGON ((-83.13630642653472 42.43895550416347...  3.959841e+06  
1        POLYGON ((-83.13630642653472 42.43895550416347...  3.959841e+06  
1        POLYGON ((-83.13630642653472 42.43895550416347...  3.959841e+06  
1        POLYGON ((-83.13630642653472 42.43895550416347...  3.959841e+06  
3        POLYGON ((-83.17870657596477 42.39269734838572...  3.918602e+06  
3        POLYGON ((-83.17870657596477 42.39269734838572...  3.918602e+06  
3        POLYGON ((-83.17870657596477 42.39269734838572...  3.918602e+06  
3        POLYGON ((-83.17870657596477 42.39269734838572...  3.918602e+06  
3        POLYGON ((-83.17870657596477 42.39269734838572...  3.918602e+06  
3        POLYGON ((-83.17870657596477 42.39269734838572...  3.918602e+06  
3        POLYGON ((-83.17870657596477 42.39269734838572...  3.918602e+06  
3        POLYGON ((-83.17870657596477 42.39269734838572...  3.918602e+06  
3        POLYGON ((-83.17870657596477 42.39269734838572...  3.918602e+06

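'CATEGORY_left' comes from a geopandas.GeoDataFrame of point geometries that I joined in with geopandas.sjoin. It contains crime-related incidents of different categories and looks like this: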

'CATEGORY_left'

     INCIDENTDATE          CATEGORY                            geometry
0          2009             LARCENY  POINT (-83.06870000000001 42.3516)
1          2009             ASSAULT            POINT (-82.9504 42.4262)
2          2009             ASSAULT            POINT (-83.2657 42.4371)
3          2009  DAMAGE TO PROPERTY  POINT (-83.03189999999999 42.4381)
4          2009      STOLEN VEHICLE            POINT (-83.1499 42.4094)

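'CATEGORY_right' also comes from a geopandas.GeoDataFrame that I joined in with geopandas.sjoin. It contains different points of interest, which depend only on the 'POLYGON' they fall into. They do not change over time.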

'CATEGORY_right'

          CATEGORY                                          geometry
13243          atm      POINT (-83.06221670000002 42.32472120000001)
13244          atm                    POINT (-83.0711901 42.3213266)
13245          atm             POINT (-83.0232692 42.34089829999999)
24624  supermarket             POINT (-83.2400998 42.37158820000001)
24625  supermarket                    POINT (-82.9728123 42.3872246)

For the regression analysis I need the data in the following shape.

The end result should look like this:

         INCIDENTDATE       TOTAL_CRIME_COUNT    RESTAURANT_COUNT  \
POLYGON                                                       
1                2009                    4396                 30
1                2010                    6455                 30
1                2011                    7434                 30
1                2012                    3843                 30
1                2013                    5354                 30
1                2014                    3425                 30
3                2010                    4564                 10
3                2011                    3234                 10
3                2012                    8754                 10
3                2013                    4829                 10
3                2014                    9583                 10
3                2015                    4354                 10

                                                  geometry    shape_area  
POLYGON                                                                   
1        POLYGON ((-83.13630642653472 42.43895550416347...  3.959841e+06  
1        POLYGON ((-83.13630642653472 42.43895550416347...  3.959841e+06  
1        POLYGON ((-83.13630642653472 42.43895550416347...  3.959841e+06  
1        POLYGON ((-83.13630642653472 42.43895550416347...  3.959841e+06  
1        POLYGON ((-83.13630642653472 42.43895550416347...  3.959841e+06  
1        POLYGON ((-83.13630642653472 42.43895550416347...  3.959841e+06  
1        POLYGON ((-83.13630642653472 42.43895550416347...  3.959841e+06  
1        POLYGON ((-83.13630642653472 42.43895550416347...  3.959841e+06  
1        POLYGON ((-83.13630642653472 42.43895550416347...  3.959841e+06  
1        POLYGON ((-83.13630642653472 42.43895550416347...  3.959841e+06  
3        POLYGON ((-83.17870657596477 42.39269734838572...  3.918602e+06  
3        POLYGON ((-83.17870657596477 42.39269734838572...  3.918602e+06  
3        POLYGON ((-83.17870657596477 42.39269734838572...  3.918602e+06  
3        POLYGON ((-83.17870657596477 42.39269734838572...  3.918602e+06  
3        POLYGON ((-83.17870657596477 42.39269734838572...  3.918602e+06  
3        POLYGON ((-83.17870657596477 42.39269734838572...  3.918602e+06  
3        POLYGON ((-83.17870657596477 42.39269734838572...  3.918602e+06  
3        POLYGON ((-83.17870657596477 42.39269734838572...  3.918602e+06  
3        POLYGON ((-83.17870657596477 42.39269734838572...  3.918602e+06

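Things to note:

  1. Within each polygon, rows with the same 'INCIDENTDATE' value are aggregated into one row.
  2. The 'TOTAL_CRIME_COUNT' column holds the total number of crime incidents per polygon per year.
  3. The points of interest are counted per polygon, by kind. Each kind of point of interest needs its own column.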

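I would be happy about even the slightest hint towards a solution. I am also open to completely different approaches for getting to my final DataFrame, since I am not even sure I started out the right way.

If you made it this far, thank you very much!

Charles

PS: I apologize for any grammar mistakes. English is not my first language.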

Tags: python, pandas, dataframe, transformation, geopandas

Solution
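No answer was preserved with this question, so what follows is only a minimal sketch of one possible approach, not a confirmed solution. It assumes that the crime points and the points of interest are each joined to the polygons separately, and that counting happens before the two results are combined. Counting from a frame that already contains both joins may overcount, because each crime row can be repeated once per point of interest in the same polygon. The frames `crimes` and `pois` are hypothetical stand-ins for the two geopandas.sjoin results, with the geometry column left out, and the values in them are made up for illustration.

```python
import pandas as pd

# Hypothetical stand-ins for the two sjoin results: every point already
# carries the id of the polygon it falls in. Values are illustrative only.
crimes = pd.DataFrame({
    "POLYGON": [1, 1, 1, 1, 3, 3, 3],
    "INCIDENTDATE": [2009, 2009, 2010, 2012, 2010, 2011, 2011],
    "CATEGORY": ["BURGLARY", "HOMICIDE", "ASSAULT", "LARCENY",
                 "MURDER/INFORMATION", "BURGLARY", "ASSAULT"],
})
pois = pd.DataFrame({
    "POLYGON": [1, 1, 1, 3, 3],
    "CATEGORY": ["restaurant", "restaurant", "atm", "crossing", "atm"],
})

# Requirements 1 and 2: one row per polygon and year, holding the number
# of crime incidents in that polygon during that year.
crime_counts = (
    crimes.groupby(["POLYGON", "INCIDENTDATE"])
    .size()
    .reset_index(name="TOTAL_CRIME_COUNT")
)

# Requirement 3: one column per kind of point of interest, counted per polygon.
poi_counts = pd.crosstab(pois["POLYGON"], pois["CATEGORY"])
poi_counts.columns = [f"{name.upper()}_COUNT" for name in poi_counts.columns]
poi_counts = poi_counts.reset_index()

# The point-of-interest counts do not depend on the year, so the merge
# repeats them on every yearly row of the same polygon.
result = crime_counts.merge(poi_counts, on="POLYGON", how="left")
print(result)
```

The 'geometry' and 'shape_area' columns could then be attached again by merging `result` onto the original polygon layer on 'POLYGON', for example `polygons.merge(result, on="POLYGON")`, where `polygons` is a hypothetical name for that GeoDataFrame with 'POLYGON' as a regular column.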

