首页 > 解决方案 > 使用 SQL 和/或 Python 将 BigQuery GIS 表转换为 GeoJSON 类型的特征集合

问题描述

我有一些 BigQuery 表,其中包含表示船轨的空间数据(经度、纬度、点和线串)。我正在尝试将它们放入 GeoJSON 以获取 API 输出。我有一种方法使用 Python 将我的表查询到 pandas 数据框中,然后我可以使用一个函数来创建一个特征集合。但是,每天有这么多的数据,如果我可以执行某种有效的线串并集并简化,那将是理想的。

我目前的方法:

# Query BigQuery table
sql = """SELECT 
    lon, lat, lead_lon, lead_lat, 
    CAST(CAST(timestamp AS DATE) as string) as date,
    UNIX_SECONDS(timestamp) as unix_secs,
    CAST(ship_num AS STRING) AS ship_num,
    op,
    CAST(knots AS FLOAT64) AS knots,
    point,
    linestring
    FROM `ship_segments`  
    WHERE timestamp BETWEEN '2020-04-16' AND '2020-04-17';"""

# Make into pandas dataframe
df = client.query(sql).to_dataframe()

#df to geojson fn
def data2geojson(df):
        features = []
        insert_features = lambda X: features.append(
                geojson.Feature(geometry=geojson.LineString(([X["lead_lon"], X["lead_lat"], X["knots"], X["unix_secs"]],
                                                             [X["lon"], X["lat"], X["knots"], X["unix_secs"]])),
                                properties=dict(date=X["date"],
                                                mmsi=X["ship_num"],
                                                operator=X["op"]
                                                )))
        df.apply(insert_features, axis=1)

        geojson_obj = geojson.dumps(geojson.FeatureCollection(features, indent=2, sort_keys=True), sort_keys=True, ensure_ascii=False)
        return(geojson_obj)

results = data2geojson(df)

这将返回一个 GeoJSON:

{"features": [{"geometry": {"coordinates": [[-119.049945, 33.983277, 10.5502, 1587104709], [-119.034677, 33.975823, 10.5502, 1587104709]], "type": "LineString"}, "properties": {"date": "2020-04-17", "mmsi": "235098383", "operator": "Evergreen Marine Corp"}, "type": "Feature"}, {"geometry": {"coordinates": [[-120.176933, 34.282107, 22.7005, 1587114969], [-120.144453, 34.275147, 22.7005, 1587114969]], "type": "LineString"}, "properties": {"date": "2020-04-17", "mmsi": "235098383", "operator": "Evergreen Marine Corp"}, "type": "Feature"}, {"geometry": {"coordinates": [[-118.361737, 33.64647, 11.3283, 1587096305], [-118.356308, 33.643713, 11.3283, 1587096305]], "type": "LineString"}, "properties": {"date": "2020-04-17", "mmsi": "538005412", "operator": "Scorpio MR Pool Ltd"}, "type": "Feature"}, {"geometry": {"coordinates": [[-118.414667, 33.673013, 12.7684, 1587097278], [-118.411707, 33.671493, 12.7684, 1587097278]], "type": "LineString"}, "properties": {"date": "2020-04-17", "mmsi": "538005412", "operator": "Scorpio MR Pool Ltd"}, "type": "Feature"}, {"geometry": {"coordinates": [[-119.377783, 34.062612, 10.5456, 1587102119], [-119.384212, 34.064217, 10.5456, 1587102119]], "type": "LineString"}, "properties": {"date": "2020-04-17", "mmsi": "636018225", "operator": "Ocean Network Express Pte Ltd"}, "type": "Feature"}], "indent": 2, "sort_keys": true, "type": "FeatureCollection"}

但我正在尝试类似的东西:

select 
ship_num,
date(timestamp) as date,
AVG(speed_knots) as avg_speed_knots,
st_union_agg(linestring) as multiline
from(
SELECT 
  *,
  row_number() OVER w AS num,
  ST_GeogPoint(lon,lat) as geom,
  LEAD(ST_GeogPoint(lon,lat)) OVER w AS geom2,
  ST_MAKELINE((ST_GeogPoint(lon,lat)), (LEAD(ST_GeogPoint(lon,lat)) OVER w)) AS linestring,
  LEAD(STRING(timestamp), 0) OVER w AS t1, 
  LEAD(STRING(timestamp), 1) OVER w AS t2,
  FROM
  `ship_data`
  where timestamp >= '2020-04-10'
  WINDOW w AS (PARTITION BY ship_num ORDER BY timestamp)) AS q
group by
ship_num, date(timestamp);

这给了我表格中的多行字符串,但随后我需要简化并将它们放入 GeoJSON FeatureCollection 输出中。

任何不使用 PostGIS 的想法?

标签: sqlpython-3.xgoogle-bigquerygeojsonmultilinestring

解决方案


推荐阅读