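python - Converting Python code to PySpark code
Problem description
The code below is plain Python and I want to convert it to PySpark. Specifically, I am not sure what the PySpark equivalent of the statement pd.read_sql(query, connect_to_hive) is.
I need to pull data from the EDL (enterprise data lake), so I use pyodbc to open a connection to the EDL and then extract the data with a SQL query.
pyodbc connection to the enterprise data lake: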
import pandas as pd
import pyodbc
connect_to_hive = pyodbc.connect("DSN=Hive", autocommit=True)
transaction = pd.read_sql(query, connect_to_hive)  # query result as a pandas DataFrame
connect_to_hive.close()
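# The query: just a basic SQL query to reproduce the problem
# (the alias, the closing parenthesis and the final SELECT are added so that it parses).
query = '''
with trans as (
    SELECT
        a.employee_name,
        a.employee_id
    FROM EMP a
)
SELECT * FROM trans
'''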
Solution
The code above can be converted to Spark SQL as follows:
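from pyspark.sql import SparkSession

# Hive support lets spark.sql() resolve tables registered in the Hive metastore
spark = SparkSession.builder.enableHiveSupport().getOrCreate()
# The alias, the closing parenthesis and the final SELECT are added so that the query parses
query = '''
with trans as (
    SELECT
        a.employee_name,
        a.employee_id
    FROM EMP a
)
SELECT * FROM trans
'''
employeeDF = spark.sql(query)  # returns a Spark DataFrame
employeeDF.show(truncate=False)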
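The query runs as-is against the Hive tables, provided the Spark session can reach the Hive metastore, and the result is returned to you as a Spark DataFrame rather than a pandas one.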
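If the rest of the code still expects a pandas DataFrame, as it did with pd.read_sql, the Spark result can be converted with toPandas(). The helper below is only a sketch of that idea: run_hive_query and its as_pandas parameter are names made up for illustration, and it assumes the whole result fits in driver memory, since toPandas() collects every row to the driver.

```python
def run_hive_query(query, spark=None, as_pandas=False):
    """Run a SQL query through Spark and return the result.

    Hypothetical helper, not part of PySpark. If no session is passed in,
    one is created with Hive support enabled.
    """
    if spark is None:
        # Imported lazily so the helper can be defined without PySpark installed
        from pyspark.sql import SparkSession
        spark = SparkSession.builder.enableHiveSupport().getOrCreate()

    result = spark.sql(query)  # Spark DataFrame, evaluated lazily

    if as_pandas:
        # Collects all rows to the driver; only suitable for small results
        return result.toPandas()
    return result
```

Calling run_hive_query(query, as_pandas=True) would then play roughly the role of pd.read_sql(query, connect_to_hive) in the original code, without the pyodbc connection.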