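python-3.x - How to assign new coordinates to a multi-index in xarray
Problem description
I am trying to assign new coordinates to a multiIndex of an xarray DataArray.
I have a DataArray with 2 main dimensions ("longitude", "latitude") and a single multi-index ("states").
Here is the DataArray structure: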
print(dataArray)
<xarray.DataArray (longitude: 5000, latitude: 3000)>
array([[nan, nan, nan, ..., nan, nan, nan],
[nan, nan, nan, ..., nan, nan, nan],
[nan, nan, nan, ..., nan, nan, nan],
...,
[nan, nan, nan, ..., nan, nan, nan],
[nan, nan, nan, ..., nan, nan, nan],
[nan, nan, nan, ..., nan, nan, nan]])
Coordinates:
* longitude (longitude) float64 -145.0 -145.0 -144.9 ... -15.05 -15.03 -15.0
* latitude (latitude) float64 -85.0 -84.96 -84.93 ... 24.93 24.96 25.0
states (latitude, longitude) float64 nan nan nan nan ... nan nan nan nan
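The "states" multi-index contains only integer values, and I would like to convert them, or add a second multi-index with "named coordinates" (i.e. USA, Italy, Germany, Brazil, ...).
Once there is a named "states" multi-index, one could easily select a given state by its proper name from the available indexes.
Below is a reproducible script. It is taken from here: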
import pandas as pd
pd.set_option('display.width', 50000)
pd.set_option('display.max_rows', 50000)
pd.set_option('display.max_columns', 5000)
import geopandas
from rasterio import features
from affine import Affine
import numpy as np
import xarray as xr
from cartopy.io import shapereader

def transform_from_latlon(lat, lon):
    lat = np.asarray(lat)
    lon = np.asarray(lon)
    trans = Affine.translation(lon[0], lat[0])
    scale = Affine.scale(lon[1] - lon[0], lat[1] - lat[0])
    return trans * scale

def rasterize(shapes, coords, fill=np.nan, **kwargs):
    """Rasterize a list of (geometry, fill_value) tuples onto the given
    xray coordinates. This only works for 1d latitude and longitude
    arrays.
    """
    transform = transform_from_latlon(coords['latitude'], coords['longitude'])
    out_shape = (len(coords['latitude']), len(coords['longitude']))
    raster = features.rasterize(shapes, out_shape=out_shape,
                                fill=fill, transform=transform,
                                dtype=float, **kwargs)
    return xr.DataArray(raster, coords=coords, dims=('latitude', 'longitude'))

if '__main__' == __name__:
    # this shapefile is from natural earth data
    # http://www.naturalearthdata.com/downloads/10m-cultural-vectors/10m-admin-1-states-provinces/
    resolution = '10m'
    category = 'cultural'
    name = 'admin_0_countries'
    shpfilename = shapereader.natural_earth(resolution, category, name)
    # read the shapefile using geopandas
    states = geopandas.read_file(shpfilename)
    South_America = states[states['SUBREGION'] == 'South America'].reset_index(drop=True)
    state_ids = {k: i for i, k in enumerate(South_America['NAME_LONG'])}
    shapes = [(shape, n) for n, shape in enumerate(South_America.geometry)]
    LONGITUDE = np.linspace(-145, -15, num=5000)
    LATITUDE = np.linspace(-85, 25, num=3000)
    ds = xr.DataArray(coords=(LONGITUDE, LATITUDE), dims=['longitude', 'latitude'])
    ds['states'] = rasterize(shapes, ds.coords)
    # trying to assign new coordinates to the dimension:
    try:
        ds = ds.assign_coords(states = South_America['NAME_LONG'])
    except ValueError:
        print("message error", "cannot add coordinates with new dimensions to a DataArray")
    # ds = ds.expand_dims({'names':South_America['NAME_LONG']}) # --> this does not work
    Array = np.random.randn(LATITUDE.size, LONGITUDE.size)
    dArray_Brazil = xr.DataArray(Array, coords=(LATITUDE, LONGITUDE), dims=['latitude', 'longitude'])
    import matplotlib.pyplot as plt
    quadmash = dArray_Brazil.plot()
    ax = ds.states.where(ds.states != 'Brazil').plot(ax=quadmash.axes)
    plt.show()
Ideally, I would like the DataArray structure to be one of the following two options:
Option 1)
<xarray.DataArray (longitude: 5000, latitude: 3000)>
array([[nan, nan, nan, ..., nan, nan, nan],
[nan, nan, nan, ..., nan, nan, nan],
[nan, nan, nan, ..., nan, nan, nan],
...,
[nan, nan, nan, ..., nan, nan, nan],
[nan, nan, nan, ..., nan, nan, nan],
[nan, nan, nan, ..., nan, nan, nan]])
Coordinates:
* longitude (longitude) float64 -145.0 -145.0 -144.9 ... -15.05 -15.03 -15.0
* latitude (latitude) float64 -85.0 -84.96 -84.93 ... 24.93 24.96 25.0
states (latitude, longitude) string Brazil, USA Germany ...
Option 2)
<xarray.DataArray (longitude: 5000, latitude: 3000)>
array([[nan, nan, nan, ..., nan, nan, nan],
[nan, nan, nan, ..., nan, nan, nan],
[nan, nan, nan, ..., nan, nan, nan],
...,
[nan, nan, nan, ..., nan, nan, nan],
[nan, nan, nan, ..., nan, nan, nan],
[nan, nan, nan, ..., nan, nan, nan]])
Coordinates:
* longitude (longitude) float64 -145.0 -145.0 -144.9 ... -15.05 -15.03 -15.0
* latitude (latitude) float64 -85.0 -84.96 -84.93 ... 24.93 24.96 25.0
states (latitude, longitude) float64 nan nan nan nan ... nan nan nan nan
Named_states (latitude, longitude) string Brazil, USA Germany ...
Solution
Here is one way to replace the coordinate values:
import numpy as np
import pandas as pd
import xarray as xr

# small example dataset with a 2D integer "states" coordinate
temp = 15 + 8 * np.random.randn(2, 2, 3)
precip = 10 * np.random.rand(2, 2, 3)
lon = [[-99.83, -99.32], [-99.79, -99.23]]
lat = [[42.25, 42.21], [42.63, 42.59]]
states = [[1,2],[3,2]]
ds = xr.Dataset({'temperature': (['x', 'y', 'time'], temp),
                 'precipitation': (['x', 'y', 'time'], precip)},
                coords={'lon': (['x', 'y'], lon),
                        'lat': (['x', 'y'], lat),
                        'time': pd.date_range('2014-09-06', periods=3),
                        'states': (['x','y'], states)})
ds
<xarray.Dataset>
Dimensions:        (time: 3, x: 2, y: 2)
Coordinates:
    lon            (x, y) float64 -99.83 -99.32 -99.79 -99.23
    lat            (x, y) float64 42.25 42.21 42.63 42.59
  * time           (time) datetime64[ns] 2014-09-06 2014-09-07 2014-09-08
    states         (x, y) int64 1 2 3 2
Dimensions without coordinates: x, y
Data variables:
    temperature    (x, y, time) float64 1.096 19.28 16.27 ... 19.25 20.38 4.981
    precipitation  (x, y, time) float64 9.09 7.486 2.288 ... 3.639 0.6625 8.19
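transdict = {'1':'Brazil', '2':'Germany', '3':'USA'} # need dictionary for all mappings
# convert the integer ids to strings so they can be compared with the dictionary keys
ds.states.values = ds.states.astype(str)
# replace each id string with its name, one mapping at a time
for key, value in transdict.items():
    ds.states.values = np.where(ds.states.values == key, value, ds.states.values)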
ds
<xarray.Dataset>
Dimensions:        (time: 3, x: 2, y: 2)
Coordinates:
    lon            (x, y) float64 -99.83 -99.32 -99.79 -99.23
    lat            (x, y) float64 42.25 42.21 42.63 42.59
  * time           (time) datetime64[ns] 2014-09-06 2014-09-07 2014-09-08
    states         (x, y) <U21 'Brazil' 'Germany' 'USA' 'Germany'
Dimensions without coordinates: x, y
Data variables:
    temperature    (x, y, time) float64 1.096 19.28 16.27 ... 19.25 20.38 4.981
    precipitation  (x, y, time) float64 9.09 7.486 2.288 ... 3.639 0.6625 8.19
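For the question's own layout (Option 2), the same idea can be sketched with a lookup table instead of a loop. This is only an illustration under stated assumptions: the "states" coordinate holds float ids with NaN outside every shape, as in the printed DataArray, and each id is the position of the state's name in a list. The helper `ids_to_names`, the coordinate name `named_states`, the three-name table and the tiny 2 x 3 grid are all made up for the example; in the question the table would be `South_America['NAME_LONG']`.

```python
import numpy as np
import xarray as xr

# Hypothetical id -> name table (position in the array is the rasterized id).
names = np.array(['Argentina', 'Brazil', 'Chile'])

# Small stand-in for the rasterized grid: float ids, NaN where no shape was drawn.
ids = np.array([[np.nan, 0.0, 1.0],
                [2.0, 1.0, np.nan]])

da = xr.DataArray(
    np.random.randn(2, 3),
    coords={'latitude': [-10.0, -20.0], 'longitude': [-60.0, -50.0, -40.0]},
    dims=('latitude', 'longitude'),
)
# Attach the ids as a 2D non-index coordinate on the existing dimensions.
da = da.assign_coords(states=(('latitude', 'longitude'), ids))


def ids_to_names(id_grid, names, missing=''):
    """Map a float id grid (NaN = no state) to an array of name strings."""
    id_grid = np.asarray(id_grid, dtype=float)
    # Append the "missing" label as the last slot of the lookup table.
    table = np.append(np.asarray(names, dtype=str), missing)
    # Point NaN cells at that last slot, then index the table in one step.
    idx = np.where(np.isnan(id_grid), len(table) - 1, id_grid).astype(int)
    return table[idx]


named = ids_to_names(da['states'].values, names)
# Add the names as a second coordinate and keep the numeric ids untouched.
da = da.assign_coords(named_states=(('latitude', 'longitude'), named))

# Select by name: cells that are not 'Brazil' are masked to NaN.
brazil = da.where(da['named_states'] == 'Brazil')
```

Because the new coordinate reuses the existing `latitude` and `longitude` dimensions, `assign_coords` does not introduce a new dimension here. Keeping the numeric `states` coordinate alongside `named_states` means either one can be used in `where`.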