excel - Excel VBA Redshift 查询性能改进
问题描述
我有一个启用了 excel 宏的工作簿,它为用户提供了输入一些参数以在查询中用作过滤器 (WHERE) 子句的选项。这又被提供给查询。我有大约 3 个不使用过滤器的查询和 4 个或 5 个查询,具体取决于选择使用过滤器运行的过滤器。查询复杂度各不相同。
查询针对 Redshift 集群运行。(所有数据都是保密的,RS只是内部连接,所以我不能给出整个查询或任何东西,只是例子)
3 个小查询是 1-2 行。其余 5 行中的 3 或 4 行约为 40 行,第 5 行约为 100。
直接在没有过滤器的集群上运行时:返回约 42400 行和 23 列
3 个小查询在不到 3 秒左右的时间内运行并加载到 Excel 文件中
- 中查询 1:在集群上 - ~1 秒
- 中查询 2:在集群上 ~5 秒
- 中查询 3:在集群上 - ~9 秒
- 大型查询 1:在集群上 - ~24 秒
现在问题出在这里,当我在 vba 中运行这些查询时,使用以下每个查询来更新列表对象(示例代码)需要 980.59(~16.4 分钟)秒
CS = "ODBC;Driver={Amazon Redshift (x64)};SERVER={RS1.us-east-1.redshift.amazonaws.com};PORT=8192;DATABASE=db1;UID=user;PASSWORD=fakepasswrod;sslmode=require"
With Sheet2.ListObjects.Add(SourceType:=0, Source:=CS, Destination:=Sheet2.Range("$A$1")).QueryTable
.CommandText = Sql
.RefreshStyle = xlInsertDeleteCells
.AdjustColumnWidth = True
.ListObject.DisplayName = "Name_of_LO_1"
.Refresh BackgroundQuery:=False
End With
此外,我必须让用户能够执行通配符、逗号分隔列表和过滤器的单个条目。从单元格值构建该部分不需要很长时间。
我必须使用类似于以下的大型 if 语句构建过滤器
'Filter Fields
C_List = UCase(ThisWorkbook.Sheets(Sheet1.Name).Range("D1").Value)
S_List = UCase(ThisWorkbook.Sheets(Sheet1.Name).Range("D2").Value)
F_List = UCase(ThisWorkbook.Sheets(Sheet1.Name).Range("D3").Value)
s_year = ThisWorkbook.Sheets(Sheet1.Name).Range("D4").Value
Scen = "'" & ThisWorkbook.Sheets(Sheet1.Name).Range("D5").Value & "'"
prior_s_year_1 = "'" & ThisWorkbook.Sheets(Sheet1.Name).Range("D6").Value & "'"
prior_Scen_1 = "'" & ThisWorkbook.Sheets(Sheet1.Name).Range("D7").Value & "'"
prior_s_year_2 = "'" & ThisWorkbook.Sheets(Sheet1.Name).Range("D8").Value & "'"
prior_Scen_2 = "'" & ThisWorkbook.Sheets(Sheet1.Name).Range("D9").Value & "'"
cat = UCase(ThisWorkbook.Sheets(Sheet1.Name).Range("D10").Value)
subcat = UCase(ThisWorkbook.Sheets(Sheet1.Name).Range("D11").Value)
If Site_List = "" And Cluster_List = "" And FBN_List = "" Then
response = MsgBox("You have chosen no Site, Cluster or FBN filters, this will pull all data and may take some time" & vbNewLine & "Do you wish to continue?", vbYesNo)
If response = vbNo Then
Call MsgBox("Exiting data retrieval, please enter Site, Cluster or FBN filters and restart", vbOKOnly)
Call DeleteConnections
Exit Sub
End If
ElseIf C_List = "ALL" Then
UserDefinedFilters = " bd.reg IN ( SELECT DISTINCT c FROM att_1 ) "
ElseIf S_List <> "" And C_List <> "" And F_List <> "" Then
S_List = Replace(S_List, ", ", ",")
C_List = Replace(C_List, ", ", ",")
F_List = Replace(F_List, ", ", ",")
UserDefinedFilters = UserDefinedFilters & " UPPER(s) in ('" & Replace(S_List, ",", "','") & "')" & _
vbNewLine & " AND UPPER(reg) in ('" & Replace(C_List, ",", "','") & "')" & _
vbNewLine & " AND UPPER(f) in ('" & Replace(F_List, ",", "','") & "')"
ElseIf S_List <> "" And C_List <> "" And F_List = "" Then
S_List = Replace(S_List, ", ", ",")
Cluster_List = Replace(C_List, ", ", ",")
UserDefinedFilters = UserDefinedFilters & " UPPER(s) in ('" & Replace(S_List, ",", "','") & "')" & _
vbNewLine & " AND UPPER(reg) in ('" & Replace(C_List, ",", "','") & "')"
ElseIf S_List <> "" And C_List = "" And F_List = "" Then
S_List = Replace(S_List, ", ", ",")
UserDefinedFilters = UserDefinedFilters & " UPPER(s) in ('" & Replace(S_List, ",", "','") & "')"
ElseIf S_List = "" And C_List <> "" And F_List = "" Then
C_List = Replace(C_List, ", ", ",")
UserDefinedFilters = UserDefinedFilters & " UPPER(reg) in ('" & Replace(C_List, ",", "','") & "')"
ElseIf S_List = "" And C_List = "" And F_List <> "" Then
If InStr(1, F_List, ",") > 0 Then
F_List = Replace(F_List, ", ", ",")
UserDefinedFilters = UserDefinedFilters & " UPPER(bd.f) in ('" & Replace(UCase(F_List), ",", "','") & "')"
ElseIf InStr(1, F_List, "*") > 0 Then
UserDefinedFilters = UserDefinedFilters & " UPPER(bd.f) LIKE '%" & Replace(UCase(F_List), "*", "") & "%'"
ElseIf InStr(1, F_List, "ABC") > 0 Then
UserDefinedFilters = UserDefinedFilters & " UPPER(bd.f) LIKE '%" & UCase(Left(F_List, 12)) & "%'"
Else
UserDefinedFilters = UserDefinedFilters & " UPPER(bd.f) in ('" & UCase(F_List) & "')"
End If
ElseIf S_List = "" And C_List <> "" And F_List <> "" Then
If InStr(1, F_List, ",") > 0 Then
F_List = Replace(F_List, ", ", ",")
UserDefinedFilters = UserDefinedFilters & " UPPER(bd.f) in ('" & Replace(UCase(F_List), ",", "','") & "')"
ElseIf InStr(1, F_List, "*") > 0 Then
UserDefinedFilters = UserDefinedFilters & " UPPER(bd.f) LIKE '%" & Replace(UCase(F_List), "*", "") & "%'"
Else
UserDefinedFilters = UserDefinedFilters & " UPPER(bd.f) in ('" & UCase(F_List) & "')"
End If
End If
'Cat and SubCat Filters
If cat <> "" And subcat <> "" Then
cat = Replace(cat, ",", "','")
subcat = Replace(subcat, ",", "','")
BCSFilters = BCSFilters & " AND UPPER(sca.cat) IN ('" & cat & "')" & _
vbNewLine & "AND UPPER(sca.subcat) in ('" & subcat & "')"
ElseIf cat <> "" And subcat = "" Then
cat = Replace(cat, ",", "','")
BCSFilters = BCSFilters & " AND UPPER(sca.cat) IN ('" & cat & "')"
ElseIf cat = "" And subcat <> "" Then
subcat = Replace(subcat, ",", "','")
BCSFilters = BCSFilters & " AND UPPER(sca.subcat) IN ('" & subcat & "')"
End If
以上只是两组,但它应该让您了解我必须做什么来构建 where 子句。
我找不到使用 ADODB 让记录集工作的方法,我不确定这是否会更快。如果可能的话,我需要做这个 DSNless,因为该文件被广泛的用户使用。任何人都可以想到的任何事情都可能有助于减少查询中的大量时间?
编辑:
添加我为记录集尝试的代码:
Dim conn As Object
Dim rs As Object
Set conn = CreateObject("ADODB.Connection")
Set rs = CreateObject("ADODB.Recordset")
CS = "Driver={Amazon Redshift (x64)};DATA SOURCE={RS1.us-east-1.redshift.amazonaws.com};PORT=8192;DATABASE=db1;UID=user;PWD=fakepasswrod;sslmode=require"
conn.Open CS
Set RegAtt = ThisWorkbook.Sheets(Sheet6.Name)
RegAtt.Cells.Clear
RegSql = "SELECT cl,reg,curr FROM schema.table1"
rs.Open RegSql
With RegAtt.ListObjects.Add(xlSrcQuery, rs, Destination:=RegAtt.Range("$A$1")).QueryTable
'.CommandText = RegSql
.RefreshStyle = xlInsertDeleteCells
.AdjustColumnWidth = True
.ListObject.DisplayName = "LO_2"
.Refresh BackgroundQuery:=False
End With
该连接字符串我得到一个驱动程序未找到错误。
这个CS = "Driver={Amazon Redshift (x64)};SERVER={RS1.us-east-1.redshift.amazonaws.com};PORT=8192;DATABASE=db1;ID=user;PASSWORD=fakepasswrod;sslmode=require"
我得到 3709 - 连接不能用于执行此操作。在这种情况下,它要么是关闭的,要么是无效的。
解决方案
可能是这条线.AdjustColumnWidth = True
导致了性能下降?(因为它必须加载数据以确定自动宽度)。
Application.ScreenUpdating
您是否考虑过使用set toFalse
和Application.Calculation
set to 来执行大部分代码xlCalculationManual
?
这可能值得一试,看看它是否能提高性能。如果是这样:
- 您可以在屏幕更新被禁用期间将一些适当的用户显示消息放在适当的位置。
- 好的做法是存储然后恢复 and 的值
ScreenUpdating
,Calculation
以便环境保持在子例程开始时的状态
推荐阅读
- go - 使用 helm go sdk 删除 Kubernetes 中的所有资源
- uml - 循环的活动图迭代固定次数,如果迭代结束,则指向不同的活动
- python - 在 Python 中预测股票买入或持有
- php - 嗨,我把项目放在 cpanel 共享主机上,但是它要发送的电子邮件 iget 错误消息 Mailer [SMTP] 未定义
- html - 如何将 onClick() 事件添加到图标格式?
- java - MapsId没用过,是自动理解的吗?
- apache-spark - Spark 从新位置读取,保持输出目录相同
- r - 为什么数据表中会出现浮点精度错误
- visual-studio - Blend 2019 中的“没有 MainWindow.xaml 的代码编辑器”错误
- c - 在C中插入二叉搜索树