apache-spark - How do I print out a spark.sql object?
Problem description
I have a spark.sql object that includes a couple of variables.
import com.github.nscala_time.time.Imports.LocalDate
val first_date = new LocalDate(2020, 4, 1)
val second_date = new LocalDate(2020, 4, 7)
val mydf = spark.sql(s"""
select *
from tempView
where timestamp between '{0}' and '{1}'
""".format(start_date.toString, end_date.toString))
I want to print out mydf because I ran mydf.count and got 0 as the outcome.
I ran mydf and got back mydf: org.apache.spark.sql.DataFrame = [column: type]
I also tried println(mydf) and it didn't return the query.
There is this related question, but it does not have the answer.
How can I print out the query?
Solution
The simplest way is to store your query in a variable and then print the variable to get the query.
- Use the variable in spark.sql
Example:
In Spark-scala:
val start_date="2020-01-01"
val end_date="2020-02-02"
val query = s"""select * from tempView where timestamp between '${start_date}' and '${end_date}'"""
println(query)
//select * from tempView where timestamp between '2020-01-01' and '2020-02-02'
spark.sql(query)
In Pyspark:
start_date="2020-01-01"
end_date="2020-02-02"
query = """select * from tempView where timestamp between '{0}' and '{1}'""".format(start_date, end_date)
print(query)
#select * from tempView where timestamp between '2020-01-01' and '2020-02-02'
#use the same query in spark.sql
spark.sql(query)
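To make the "print the variable" step easier to reuse, here is a minimal Python sketch, not part of the original answer. The helper names build_query and unfilled_placeholders are hypothetical; only tempView and the timestamp column come from the question. It builds the query text, prints it, and checks whether any placeholder was left unsubstituted. It needs no Spark session to run.

```python
import re
from datetime import date

# Template for the query; view and column names are taken from the question.
QUERY_TEMPLATE = """
select *
from tempView
where timestamp between '{start}' and '{end}'
"""


def build_query(start: date, end: date) -> str:
    """Return the final SQL text with both dates substituted in ISO format."""
    return QUERY_TEMPLATE.format(start=start.isoformat(), end=end.isoformat()).strip()


def unfilled_placeholders(query: str) -> list:
    """Return any {name} or {0}-style placeholders still present in the text."""
    return re.findall(r"\{\w*\}", query)


query = build_query(date(2020, 4, 1), date(2020, 4, 7))
print(query)  # inspect the exact text before passing it to spark.sql(query)

leftovers = unfilled_placeholders(query)
if leftovers:
    print("warning: unsubstituted placeholders:", leftovers)
```

Assuming an active session named spark, as in the question, the printed text is exactly what spark.sql(query) would receive. A check like this is relevant to the question's Scala snippet: Scala's format uses printf-style %s specifiers, so {0} and {1} were most likely left in the query as literal text, which would explain a count of 0.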