hive - Amazon Athena 在地理 json 上返回“HIVE_CURSOR_ERROR: HIVE_CURSOR_ERROR”
问题描述
我正在尝试使用 Amazon Athena 读取 geojson 文件。
我的输入数据的头部如下所示:
{
"type": "FeatureCollection",
"name": "sql_statement",
"features": [
{ "type": "Feature", "properties": { "gridsize": 500.0, "big_gid": 353, "little_gid": 22482 }, "geometry": { "type": "Polygon", "coordinates": [ [ [ -981739.267883020918816, 30855.609566356935829 ], [ -981739.267883020918816, 31355.354737498135364 ], [ -981241.022986860014498, 31355.354737498135364 ], [ -981241.022986860014498, 30855.609566356935829 ], [ -981739.267883020918816, 30855.609566356935829 ] ] ] } },
{ "type": "Feature", "properties": { "gridsize": 500.0, "big_gid": 353, "little_gid": 22483 }, "geometry": { "type": "Polygon", "coordinates": [ [ [ -981241.022986860014498, 30855.609566356935829 ], [ -981241.022986860014498, 31355.354737498135364 ], [ -980742.778090699226595, 31355.354737498135364 ], [ -980742.778090699226595, 30855.609566356935829 ], [ -981241.022986860014498, 30855.609566356935829 ] ] ] } },
{ "type": "Feature", "properties": { "gridsize": 500.0, "big_gid": 353, "little_gid": 22484 }, "geometry": { "type": "Polygon", "coordinates": [ [ [ -980742.778090699226595, 30855.609566356935829 ], [ -980742.778090699226595, 31355.354737498135364 ], [ -980244.533194538322277, 31355.354737498135364 ], [ -980244.533194538322277, 30855.609566356935829 ], [ -980742.778090699226595, 30855.609566356935829 ] ] ] } },
{ "type": "Feature", "properties": { "gridsize": 500.0, "big_gid": 353, "little_gid": 22485 }, "geometry": { "type": "Polygon", "coordinates": [ [ [ -980244.533194538322277, 30855.609566356935829 ], [ -980244.533194538322277, 31355.354737498135364 ], [ -979746.288298377417959, 31355.354737498135364 ], [ -979746.288298377417959, 30855.609566356935829 ], [ -980244.533194538322277, 30855.609566356935829 ] ] ] } },
{ "type": "Feature", "properties": { "gridsize": 500.0, "big_gid": 353, "little_gid": 22486 }, "geometry": { "type": "Polygon", "coordinates": [ [ [ -979746.288298377417959, 30855.609566356935829 ], [ -979746.288298377417959, 31355.354737498135364 ], [ -979248.043402216513641, 31355.354737498135364 ], [ -979248.043402216513641, 30855.609566356935829 ], [ -979746.288298377417959, 30855.609566356935829 ] ] ] } },
{ "type": "Feature", "properties": { "gridsize": 500.0, "big_gid": 353, "little_gid": 22487 }, "geometry": { "type": "Polygon", "coordinates": [ [ [ -979248.043402216513641, 30855.609566356935829 ], [ -979248.043402216513641, 31355.354737498135364 ], [ -978749.798506055609323, 31355.354737498135364 ], [ -978749.798506055609323, 30855.609566356935829 ], [ -979248.043402216513641, 30855.609566356935829 ] ] ] } },
我使用地理空间文档作为模板定义了表格:
CREATE external TABLE IF NOT EXISTS testdb.grid_500
(
gridsize double,
big_gid int,
little_gid int,
geometry binary
)
ROW FORMAT SERDE 'com.esri.hadoop.hive.serde.JsonSerde'
STORED AS INPUTFORMAT 'com.esri.json.hadoop.EnclosedJsonInputFormat'
OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
LOCATION 's3://jdl-athena/grid/'
;
然而,我对表运行的任何查询都会返回相同的、相当无用的错误:
Your query has the following error(s): HIVE_CURSOR_ERROR: HIVE_CURSOR_ERROR This query ran against the "testdb" database, unless qualified by the query. Please post the error message on our forum or contact customer support with Query Id: 25d3da93-5cfd-46bb-9f77-4eb014679ba6.
任何想法为什么我可能会收到此错误或如何调试它?我什至无法弄清楚如何获得足够的信息来诊断可能发生的事情。
解决方案
因此,经过多次 Google Fu 之后,我发现(当然)ESRI 已经为地理数据创建了自己的 JSON 格式。我使用的是开放标准GeoJson
,ESRI 格式称为ESRI Enclosed JSON
. 对于几何 JSON,Athena 仅支持 ESRI 格式,不支持 GEOJSON。
我在这里发现了这种区别: https ://github.com/Esri/gis-tools-for-hadoop/issues/29
所以看起来我需要不同格式的输入数据。
推荐阅读
- c# - 掷骰子模拟器中的随机类和 if 语句?
- javascript - 如何确保页面已被查看 X 秒?
- mysql - 硬盘克隆后无法访问 MySQL 数据库(本地)
- c# - 无法从存储卡 SLE5528 读取
- python - 仅使用可打印 ASCII 字符集 (32-126) 的 Caesar Cipher 蛮力阶段的 Python 协助
- python - 使用 SMOTE 后导致高误报的不平衡数据集
- sql - Oracle 错误“无效标识符 00904.00000 -”%s:无效标识符“加入时
- javascript - autocomplete How to avoid duplicate selection
- regex - 正则表达式接受 0 到 20 之间的数字,后跟可选的十进制数字
- angular - @ViewChild 的属性不会在父组件中更新