AWS Glue SelectFields and Filter do not accept dynamic values

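Problem description

I wrote an AWS Glue script that uses the SelectFields() and Filter() methods to select fields and filter rows. I tested them with static values and they work fine, but when I pass dynamic values in the same format they do not work. Any idea why the dynamic values are not accepted? I also tested by passing just one of the dynamic values, and in that case both methods work.

Note that the key passed in (filterkey) works whether it is static or dynamic.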

wordstoFilter = ['USA', 'France']
columnstoSelect = ['cust_id', 'custname', 'state']


# Join all list values, each wrapped in single quotes, separated by commas
fltr_string = ', '.join(["'{}'".format(value) for value in wordstoFilter])
select_string = ', '.join(["'{}'".format(value) for value in columnstoSelect])


filterkey = "country"
#below statement works with static value
#country_filter_dyf = Filter.apply(frame=custData, f=(lambda x: x["country"] in ["USA"]))
country_filter_dyf = Filter.apply(frame=custData, f=(lambda x: x[filterkey] in [fltr_string]))

##Select case
#below statement works with static value
#selected_fields_dyf = SelectFields.apply(frame = custData, paths = ['cust_id', 'cust_name', 'state', 'country'])

#Below one doesn't work
selected_dyf = SelectFields.apply(frame = custData, paths = [select_string ])

Tags: python, python-3.x, aws-glue

Solution


As far as I can see, the paths parameter expects a list, but you are passing a str object:

>>> type(['cust_id', 'cust_name', 'state', 'country'])
<class 'list'>
>>> type(select_string)
<class 'str'>
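To make the difference concrete, here is a minimal sketch in plain Python (no Glue needed) that rebuilds select_string the same way the question does and inspects what ends up inside [select_string]. The name paths_wrong is only an illustrative variable, not part of the original script.

```python
columnstoSelect = ['cust_id', 'custname', 'state']

# Same construction as in the question: one long string with embedded quotes
select_string = ', '.join(["'{}'".format(value) for value in columnstoSelect])

# Wrapping that string in brackets yields a list with ONE element,
# not a list of three column names
paths_wrong = [select_string]

print(len(paths_wrong))       # 1
print(paths_wrong[0])         # 'cust_id', 'custname', 'state'
print(len(columnstoSelect))   # 3
```

So the value passed as paths names a single, nonexistent column whose name happens to look like the text of a list literal.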

Have you tried passing the list directly?

>>> type(columnstoSelect)
<class 'list'>

columnstoSelect = ['cust_id', 'custname', 'state']
# Pass the list itself rather than a joined string wrapped in a list
selected_dyf = SelectFields.apply(frame=custData, paths=columnstoSelect)
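
The Filter call in the question appears to have the same problem: `x[filterkey] in [fltr_string]` tests membership in a one-element list holding the joined string. The sketch below reproduces this in plain Python. The records list is hypothetical sample data standing in for rows of custData, and broken_predicate and fixed_predicate are illustrative names, not Glue APIs.

```python
wordstoFilter = ['USA', 'France']
filterkey = "country"

# Same construction as in the question
fltr_string = ', '.join(["'{}'".format(value) for value in wordstoFilter])

# Hypothetical sample rows standing in for the records of custData
records = [
    {"cust_id": 1, "country": "USA"},
    {"cust_id": 2, "country": "Germany"},
    {"cust_id": 3, "country": "France"},
]

def broken_predicate(x):
    # Compares against the single string "'USA', 'France'", so nothing matches
    return x[filterkey] in [fltr_string]

def fixed_predicate(x):
    # Compares against the original list of values
    return x[filterkey] in wordstoFilter

broken_matches = [r for r in records if broken_predicate(r)]
fixed_matches = [r for r in records if fixed_predicate(r)]

print(broken_matches)                          # []
print([r["cust_id"] for r in fixed_matches])   # [1, 3]
```

In the Glue script, the equivalent change would presumably be to write the lambda as `lambda x: x[filterkey] in wordstoFilter` in the Filter.apply call.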
