首页 > 解决方案 > Excel VBA Redshift 查询性能改进

问题描述

我有一个启用了 excel 宏的工作簿,它为用户提供了输入一些参数以在查询中用作过滤器 (WHERE) 子句的选项。这又被提供给查询。我有大约 3 个不使用过滤器的查询和 4 个或 5 个查询,具体取决于选择使用过滤器运行的过滤器。查询复杂度各不相同。

查询针对 Redshift 集群运行。(所有数据都是保密的,RS只是内部连接,所以我不能给出整个查询或任何东西,只是例子)

3 个小查询是 1-2 行。其余 5 行中的 3 或 4 行约为 40 行,第 5 行约为 100。

直接在没有过滤器的集群上运行时:返回约 42400 行和 23 列

3 个小查询在不到 3 秒左右的时间内运行并加载到 Excel 文件中

现在问题出在这里,当我在 vba 中运行这些查询时,使用以下每个查询来更新列表对象(示例代码)需要 980.59(~16.4 分钟)秒

CS = "ODBC;Driver={Amazon Redshift (x64)};SERVER={RS1.us-east-1.redshift.amazonaws.com};PORT=8192;DATABASE=db1;UID=user;PASSWORD=fakepasswrod;sslmode=require"

With Sheet2.ListObjects.Add(SourceType:=0, Source:=CS, Destination:=Sheet2.Range("$A$1")).QueryTable
    .CommandText = Sql
    .RefreshStyle = xlInsertDeleteCells
    .AdjustColumnWidth = True
    .ListObject.DisplayName = "Name_of_LO_1"
    .Refresh BackgroundQuery:=False
End With

此外,我必须让用户能够执行通配符、逗号分隔列表和过滤器的单个条目。从单元格值构建该部分不需要很长时间。

我必须使用类似于以下的大型 if 语句构建过滤器

'Filter Fields
C_List = UCase(ThisWorkbook.Sheets(Sheet1.Name).Range("D1").Value)
S_List = UCase(ThisWorkbook.Sheets(Sheet1.Name).Range("D2").Value)
F_List = UCase(ThisWorkbook.Sheets(Sheet1.Name).Range("D3").Value)
s_year = ThisWorkbook.Sheets(Sheet1.Name).Range("D4").Value
Scen = "'" & ThisWorkbook.Sheets(Sheet1.Name).Range("D5").Value & "'"
prior_s_year_1 = "'" & ThisWorkbook.Sheets(Sheet1.Name).Range("D6").Value & "'"
prior_Scen_1 = "'" & ThisWorkbook.Sheets(Sheet1.Name).Range("D7").Value & "'"
prior_s_year_2 = "'" & ThisWorkbook.Sheets(Sheet1.Name).Range("D8").Value & "'"
prior_Scen_2 = "'" & ThisWorkbook.Sheets(Sheet1.Name).Range("D9").Value & "'"
cat = UCase(ThisWorkbook.Sheets(Sheet1.Name).Range("D10").Value)
subcat = UCase(ThisWorkbook.Sheets(Sheet1.Name).Range("D11").Value)


If Site_List = "" And Cluster_List = "" And FBN_List = "" Then
    response = MsgBox("You have chosen no Site, Cluster or FBN filters, this will pull all data and may take some time" & vbNewLine & "Do you wish to continue?", vbYesNo)
    If response = vbNo Then
        Call MsgBox("Exiting data retrieval, please enter Site, Cluster or FBN filters and restart", vbOKOnly)
        Call DeleteConnections
        Exit Sub
    End If

ElseIf C_List = "ALL" Then
    UserDefinedFilters = " bd.reg IN ( SELECT DISTINCT c FROM att_1 ) "
    
ElseIf S_List <> "" And C_List <> "" And F_List <> "" Then
    S_List = Replace(S_List, ", ", ",")
    C_List = Replace(C_List, ", ", ",")
    F_List = Replace(F_List, ", ", ",")
    UserDefinedFilters = UserDefinedFilters & " UPPER(s) in ('" & Replace(S_List, ",", "','") & "')" & _
    vbNewLine & " AND UPPER(reg) in ('" & Replace(C_List, ",", "','") & "')" & _
    vbNewLine & " AND UPPER(f) in ('" & Replace(F_List, ",", "','") & "')"
    
ElseIf S_List <> "" And C_List <> "" And F_List = "" Then
    S_List = Replace(S_List, ", ", ",")
    Cluster_List = Replace(C_List, ", ", ",")
    UserDefinedFilters = UserDefinedFilters & " UPPER(s) in ('" & Replace(S_List, ",", "','") & "')" & _
    vbNewLine & " AND UPPER(reg) in ('" & Replace(C_List, ",", "','") & "')"
    
ElseIf S_List <> "" And C_List = "" And F_List = "" Then
    S_List = Replace(S_List, ", ", ",")
    UserDefinedFilters = UserDefinedFilters & " UPPER(s) in ('" & Replace(S_List, ",", "','") & "')"
    
ElseIf S_List = "" And C_List <> "" And F_List = "" Then
    C_List = Replace(C_List, ", ", ",")
    UserDefinedFilters = UserDefinedFilters & " UPPER(reg) in ('" & Replace(C_List, ",", "','") & "')"
    
ElseIf S_List = "" And C_List = "" And F_List <> "" Then
    If InStr(1, F_List, ",") > 0 Then
        F_List = Replace(F_List, ", ", ",")
        UserDefinedFilters = UserDefinedFilters & " UPPER(bd.f) in ('" & Replace(UCase(F_List), ",", "','") & "')"
    ElseIf InStr(1, F_List, "*") > 0 Then
        UserDefinedFilters = UserDefinedFilters & " UPPER(bd.f) LIKE '%" & Replace(UCase(F_List), "*", "") & "%'"
    ElseIf InStr(1, F_List, "ABC") > 0 Then
        UserDefinedFilters = UserDefinedFilters & " UPPER(bd.f) LIKE '%" & UCase(Left(F_List, 12)) & "%'"
    Else
        UserDefinedFilters = UserDefinedFilters & " UPPER(bd.f) in ('" & UCase(F_List) & "')"
    End If
    
ElseIf S_List = "" And C_List <> "" And F_List <> "" Then
    If InStr(1, F_List, ",") > 0 Then
        F_List = Replace(F_List, ", ", ",")
        UserDefinedFilters = UserDefinedFilters & " UPPER(bd.f) in ('" & Replace(UCase(F_List), ",", "','") & "')"
    ElseIf InStr(1, F_List, "*") > 0 Then
        UserDefinedFilters = UserDefinedFilters & " UPPER(bd.f) LIKE '%" & Replace(UCase(F_List), "*", "") & "%'"
    Else
        UserDefinedFilters = UserDefinedFilters & " UPPER(bd.f) in ('" & UCase(F_List) & "')"
    End If
End If

'Cat and SubCat Filters
If cat <> "" And subcat <> "" Then
    cat = Replace(cat, ",", "','")
    subcat = Replace(subcat, ",", "','")
    BCSFilters = BCSFilters & " AND UPPER(sca.cat) IN ('" & cat & "')" & _
    vbNewLine & "AND UPPER(sca.subcat) in ('" & subcat & "')"
    
ElseIf cat <> "" And subcat = "" Then
    cat = Replace(cat, ",", "','")
    BCSFilters = BCSFilters & " AND UPPER(sca.cat) IN ('" & cat & "')"
   
ElseIf cat = "" And subcat <> "" Then
    subcat = Replace(subcat, ",", "','")
    BCSFilters = BCSFilters & " AND UPPER(sca.subcat) IN ('" & subcat & "')"
End If

以上只是两组,但它应该让您了解我必须做什么来构建 where 子句。

我找不到使用 ADODB 让记录集工作的方法,我不确定这是否会更快。如果可能的话,我需要做这个 DSNless,因为该文件被广泛的用户使用。任何人都可以想到的任何事情都可能有助于减少查询中的大量时间?

编辑:

添加我为记录集尝试的代码:

Dim conn As Object
Dim rs As Object

Set conn = CreateObject("ADODB.Connection")
Set rs = CreateObject("ADODB.Recordset")
CS = "Driver={Amazon Redshift (x64)};DATA SOURCE={RS1.us-east-1.redshift.amazonaws.com};PORT=8192;DATABASE=db1;UID=user;PWD=fakepasswrod;sslmode=require"

conn.Open CS

Set RegAtt = ThisWorkbook.Sheets(Sheet6.Name)
RegAtt.Cells.Clear
RegSql = "SELECT cl,reg,curr FROM schema.table1"

rs.Open RegSql

With RegAtt.ListObjects.Add(xlSrcQuery, rs, Destination:=RegAtt.Range("$A$1")).QueryTable
    '.CommandText = RegSql
    .RefreshStyle = xlInsertDeleteCells
    .AdjustColumnWidth = True
    .ListObject.DisplayName = "LO_2"
    .Refresh BackgroundQuery:=False
End With

该连接字符串我得到一个驱动程序未找到错误。

这个CS = "Driver={Amazon Redshift (x64)};SERVER={RS1.us-east-1.redshift.amazonaws.com};PORT=8192;DATABASE=db1;ID=user;PASSWORD=fakepasswrod;sslmode=require"

我得到 3709 - 连接不能用于执行此操作。在这种情况下,它要么是关闭的,要么是无效的。

标签: excelvba

解决方案


可能是这条线.AdjustColumnWidth = True导致了性能下降?(因为它必须加载数据以确定自动宽度)。

Application.ScreenUpdating您是否考虑过使用set toFalseApplication.Calculationset to 来执行大部分代码xlCalculationManual

有关详细信息,请参阅https://www.microsoft.com/en-us/microsoft-365/blog/2009/03/12/excel-vba-performance-coding-best-practices/

这可能值得一试,看看它是否能提高性能。如果是这样:

  • 您可以在屏幕更新被禁用期间将一些适当的用户显示消息放在适当的位置。
  • 好的做法是存储然后恢复 and 的值ScreenUpdatingCalculation以便环境保持在子例程开始时的状态

推荐阅读