Athena InvalidRequestException from Python

Problem description

I am trying to read a CSV file from an S3 bucket and create a table in Athena from Python, but I get the exception below when I run it:


Start of DB Query

{'QueryExecutionId': '9cc82243-4220-47d0-8b63-0aa4f01fd590', 'ResponseMetadata': {'RequestId': '1c74bec6-663a-42ef-b9d1-73c7372eb4e1', 'HTTPStatusCode': 200, 'HTTPHeaders': {'content-type': 'application/x-amz-json-1.1', 'date': 'Thu, 08 Nov 2018 15:37:11 GMT', 'x-amzn-requestid': '1c74bec6-663a-42ef-b9d1-73c7372eb4e1', 'content-length': '59', 'connection': 'keep-alive'}, 'RetryAttempts': 0}}

Start of table creation

Traceback (most recent call last):
  File "C:/Users/Doc/PycharmProjects/aws-athena-repo/athena/app.py", line 61, in <module>
    QueryExecutionContext={'Database': 'athenadb'})
  File "C:\Program Files\Python37\lib\site-packages\botocore\client.py", line 320, in _api_call
    return self._make_api_call(operation_name, kwargs)
  File "C:\Program Files\Python37\lib\site-packages\botocore\client.py", line 623, in _make_api_call
    raise error_class(parsed_response, operation_name)
botocore.errorfactory.InvalidRequestException: An error occurred (InvalidRequestException) when calling the StartQueryExecution operation: line 1:8: no viable alternative at input 'CREATE EXTERNAL'


Here is my code sample:

print("Start of DB Query")
# Create a new database
db_query = 'CREATE DATABASE IF NOT EXISTS athenadb;'
response = client.start_query_execution(
    QueryString=db_query,
    ResultConfiguration={'OutputLocation': 's3://mybucket'})
print(response)

table_query = '''
CREATE EXTERNAL TABLE IF NOT EXISTS `athenadb.testtable`(
    `id` int,
    `ident` string,
    `type` string,
    `name` string,
    `latitude_deg` double,
    `longitude_deg` double,
    `continent` string,
    `iso_country` string,
    `iso_region` string,
    `municipality` string,
    `scheduled_service` string,
    `gps_code` string,
    `iata_code` string,
    `local_code` string,
    `home_link` string,
    `wikipedia_link` string,
    `keywords` string 
)
ROW FORMAT DELIMITED 
  FIELDS TERMINATED BY ',' 
  LINES TERMINATED BY '\n' 
WITH SERDEPROPERTIES ( 
  'escape.delim'='\\')
STORED AS TEXTFILE
LOCATION 's3://mybucket/folder/' ;'''

print("Start of table creation")

response1 = client.start_query_execution(
    QueryString=table_query,
    ResultConfiguration={'OutputLocation': 's3://mybucket'},
    QueryExecutionContext={'Database': 'athenadb'})
print(response1)

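I am not sure whether the problem is with ROW FORMAT DELIMITED or something else. As far as I can tell, my code is fine.

Detailed steps would be appreciated!

Thanks in advance!

Tags: python, amazon-web-services, boto3, amazon-athena, botocore

Solution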


I declared all fields as string and used OpenCSVSerde as the SerDe, configured through SERDEPROPERTIES.
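
For illustration, here is a minimal sketch of what that change might look like. It is not the exact code from the answer: the helper names `build_table_query` and `create_table` are hypothetical, the `separatorChar`, `quoteChar` and `escapeChar` values are assumed defaults for a plain comma-separated file, and the table name is left unqualified on the assumption that the database is supplied through `QueryExecutionContext`. The main difference from the question's DDL is that `WITH SERDEPROPERTIES` follows `ROW FORMAT SERDE` rather than `ROW FORMAT DELIMITED`.

```python
# Sketch only: builds a CREATE EXTERNAL TABLE statement that uses OpenCSVSerde
# with every column typed as string, then submits it through an Athena client.

OPENCSV_SERDE = "org.apache.hadoop.hive.serde2.OpenCSVSerde"

# Column names taken from the question; all are declared as string here.
COLUMNS = [
    "id", "ident", "type", "name", "latitude_deg", "longitude_deg",
    "continent", "iso_country", "iso_region", "municipality",
    "scheduled_service", "gps_code", "iata_code", "local_code",
    "home_link", "wikipedia_link", "keywords",
]


def build_table_query(table, columns, location):
    """Return the DDL string (hypothetical helper, not part of boto3)."""
    cols = ",\n".join(f"    `{name}` string" for name in columns)
    return (
        f"CREATE EXTERNAL TABLE IF NOT EXISTS `{table}` (\n"
        f"{cols}\n"
        ")\n"
        f"ROW FORMAT SERDE '{OPENCSV_SERDE}'\n"
        "WITH SERDEPROPERTIES (\n"
        # Assumed values for an ordinary comma-separated file.
        "    'separatorChar' = ',',\n"
        "    'quoteChar' = '\"',\n"
        "    'escapeChar' = '\\\\'\n"
        ")\n"
        "STORED AS TEXTFILE\n"
        f"LOCATION '{location}'"
    )


def create_table(client, query, output_location, database):
    """Submit the DDL; `client` is expected to be a boto3 Athena client."""
    return client.start_query_execution(
        QueryString=query,
        ResultConfiguration={"OutputLocation": output_location},
        QueryExecutionContext={"Database": database},
    )
```

With a real client this would be called roughly as `create_table(boto3.client('athena'), build_table_query('testtable', COLUMNS, 's3://mybucket/folder/'), 's3://mybucket', 'athenadb')`, reusing the bucket and database names from the question.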

