首页 > 解决方案 > Add a running total to a PowerQuery result

问题描述

I am trying to create a count of dates that occur during particular weeks (the last five weeks including this week) and a total running count for dates which would include records prior to the listed five week duration.
I could do this fairly easily using standard Excel formula, but I'm trying to learn PowerQuery and the M language.

My first query returns the relevant rows from the original source data (another query that pulls everything from the Excel table and forces the correct data type for each column).
The dates I'm trying to count are contained in the Section A#(lf)Turnbacks.
The query is called Triage_Turnbacks.

let
    Source = #"Source_Data",
    #"Filter_Rows" = Table.SelectRows(Source, each ([#"Cancelled?"] = null) and ([#"Section A#(lf)Turnbacks"] <> null)),
    #"Removed_Columns" = Table.SelectColumns(Filter_Rows,{"VENDOR CODE", "Supplier Lookup", "Section A#(lf)Turnbacks"}),
    #"Duplicate_Date_Column" = Table.DuplicateColumn(Removed_Columns, "Section A#(lf)Turnbacks", "Start_Of_Week"),
    #"Start_of_Week" = Table.TransformColumns(#"Duplicate_Date_Column",{{"Start_Of_Week", Date.StartOfWeek, type date}})
in
    Start_of_Week   

| VENDOR CODE | Supplier Lookup | Section ATurnbacks | Start_Of_Week |
|-------------|-----------------|--------------------|---------------|
| A1          | Supplier A      | 08/04/2019         | 08/04/2019    |
| A1          | Supplier A      | 08/04/2019         | 08/04/2019    |
| A2          | Supplier B      | 16/04/2019         | 15/04/2019    |
| A3          | Supplier C      | 30/05/2019         | 27/05/2019    |
| A10         | Supplier D      | 24/04/2019         | 22/04/2019    |
| A5          | Supplier E      | 18/06/2019         | 17/06/2019    |
| A1          | Supplier A      | 17/07/2019         | 15/07/2019    |
| A1          | Supplier A      | 24/06/2019         | 24/06/2019    |
| A2          | Supplier B      | 11/06/2019         | 10/06/2019    |
| A3          | Supplier C      | 03/07/2019         | 01/07/2019    |
| A10         | Supplier D      | 02/07/2019         | 01/07/2019    |
| A5          | Supplier E      | 17/07/2019         | 15/07/2019    |
| A1          | Supplier A      | 17/07/2019         | 15/07/2019    |
| A1          | Supplier A      | 05/08/2019         | 05/08/2019    |
| A2          | Supplier B      | 12/07/2019         | 08/07/2019    |
| A1          | Supplier A      | 05/08/2019         | 05/08/2019    |
| A1          | Supplier A      | 29/07/2019         | 29/07/2019    |
| A2          | Supplier B      | 12/06/2019         | 10/06/2019    |
| A3          | Supplier C      | 21/02/2019         | 18/02/2019    |
| A10         | Supplier D      | 23/04/2019         | 22/04/2019    |
| A5          | Supplier E      | 14/06/2019         | 10/06/2019    |
| A10         | Supplier D      | 08/02/2019         | 04/02/2019    |

A second query returns the previous five week Monday and Sunday dates.
The query is called 5_Week_Calendar.

    let
        Source = Table.FromList({-5..0}, each{_}),
        AddedStartOfWeek = Table.AddColumn(Source, "StartOfWeek", each Date.StartOfWeek(Date.AddDays(Date.From(DateTime.LocalNow()),[Column1]*7)), type date),
        AddedEndOfWeek = Table.AddColumn(AddedStartOfWeek, "EndOfWeek", each Date.EndOfWeek([StartOfWeek]), type date),
        RemovedColumn = Table.RemoveColumns(AddedEndOfWeek,{"Column1"})
    in RemovedColumn  

| StartOfWeek | EndOfWeek  |
|-------------|------------|
| 01/07/2019  | 07/07/2019 |
| 08/07/2019  | 14/07/2019 |
| 15/07/2019  | 21/07/2019 |
| 22/07/2019  | 28/07/2019 |
| 29/07/2019  | 04/08/2019 |
| 05/08/2019  | 11/08/2019 |

My third query returns the count per week (i.e. which records in Triage_Turnbacks.Start_Of_Week matches 5_Week_Calendar.StartOfWeek.

let
    Source = Table.NestedJoin(#"5_Week_Calendar", {"StartOfWeek"}, Triage_Turnbacks, {"Start_Of_Week"}, "Triage_Turnbacks", JoinKind.LeftOuter),
    #"Removed Columns" = Table.RemoveColumns(Source,{"EndOfWeek"}),
    #"Aggregated Triage_Turnbacks" = Table.AggregateTableColumn(#"Removed Columns", "Triage_Turnbacks", {{"Start_Of_Week", List.Count, "Total Turnbacks"}})
in
    #"Aggregated Triage_Turnbacks"  

    | StartOfWeek | Total Turnbacks |
    |-------------|-----------------|
    | 01/07/2019  | 2               |
    | 08/07/2019  | 1               |
    | 15/07/2019  | 3               |
    | 22/07/2019  | 1               |
    | 29/07/2019  | 1               |
    | 05/08/2019  | 2               |

The problem

How do I add the running total which includes records before the initial 5 week date of 01/07/2019?

| StartOfWeek | Total Turnbacks | Running Total |
|-------------|-----------------|---------------|
| 01/07/2019  | 2               | 15            |
| 08/07/2019  | 1               | 16            |
| 15/07/2019  | 3               | 19            |
| 22/07/2019  | 1               | 19            |
| 29/07/2019  | 1               | 20            |
| 05/08/2019  | 2               | 22            |

标签: excelpowerquerym

解决方案


运行总计在概念上很简单,但据我所知,M的标准库目前不提供类似List.CumulativeSumTable.CumulativeSum. 这让事情变得有点尴尬,并且可能需要一个自定义函数。

运行总计可以通过以下方式实现:

ListCumulativeSum = (listOfNumbers as list) as list => List.Accumulate(listOfNumbers, {}, (listState, currentItem) => listState & {List.Last(listState, 0) + currentItem}),

其中,给定一个数字列表,返回一个数字列表。

表的等价物可能是下面的函数。(我相信有很多方法可以做到这一点;有些比其他的性能/可扩展性更高。)

TableCumulativeSum = (someTable as table, columnToSum as text, newColumnName as text) =>
    let
        runningTotal = ListCumulativeSum(Table.Column(someTable, columnToSum)),
        rightTable = Table.FromColumns({{0..List.Count(runningTotal)-1}, runningTotal}, {"$rightJoinKey", newColumnName}),
        leftTable = Table.AddIndexColumn(someTable, "$leftJoinKey"),
        joined = Table.Join(leftTable, {"$leftJoinKey"}, rightTable, {"$rightJoinKey"}, JoinKind.FullOuter),
        dropped = Table.RemoveColumns(joined, {"$leftJoinKey", "$rightJoinKey"})
    in dropped,

总的来说,回到你的问题,我认为可以通过下面的代码来实现:

let
    // This is just a function which returns a record. The record itself
    // contains two values:
    //    • a list of dates (which will be the "left join column")
    //    • a replacer function (which conditionally replaces dates)
    GetReplacerAndDates = (n as number, replaceWith as any) as record => 
        let
            startOfCurrentWeek = Date.StartOfWeek(DateTime.Date(DateTime.LocalNow())),
            nMondaysAgo = Date.AddWeeks(startOfCurrentWeek, -n),
            defaultAndLastNWeeks = {replaceWith} & List.Dates(nMondaysAgo, n + 1, #duration(7, 0, 0, 0)),
            conditionalReplacer = (someDate as date) =>
                let
                    startOfWeek = Date.StartOfWeek(someDate),
                    startOfWeekOrReplacement = if startOfWeek >= nMondaysAgo then startOfWeek else replaceWith
                in startOfWeekOrReplacement,
            toReturn = [replacer = conditionalReplacer, values = defaultAndLastNWeeks]
        in toReturn,
    rec = GetReplacerAndDates(5, "Prior period"), // Needs a better variable name.

    // You don't need this below, I only used it to give me a starting point
    // and so that I could verify my answer.
    triageTurnbacks = Table.FromColumns(
        {
            {"A1","A1","A2","A3","A10","A5","A1","A1","A2","A3","A10","A5","A1","A1","A2","A1","A1","A2","A3","A10","A5","A10"},
            {"Supplier A","Supplier A","Supplier B","Supplier C","Supplier D","Supplier E","Supplier A","Supplier A","Supplier B","Supplier C","Supplier D","Supplier E","Supplier A","Supplier A","Supplier B","Supplier A","Supplier A","Supplier B","Supplier C","Supplier D","Supplier E","Supplier D"},
            {#date(2019,4,8), #date(2019,4,8), #date(2019,4,16), #date(2019,5,30), #date(2019,4,24), #date(2019,6,18), #date(2019,7,17), #date(2019,6,24), #date(2019,6,11), #date(2019,7,3), #date(2019,7,2), #date(2019,7,17), #date(2019,7,17), #date(2019,8,5), #date(2019,7,12), #date(2019,8,5), #date(2019,7,29), #date(2019,6,12), #date(2019,2,21), #date(2019,4,23), #date(2019,6,14), #date(2019,2,8)}
        },
        type table [Vendor Code = text, Supplier Lookup = text, Section A Turnbacks = date]
    ),
    // I think the single step "startOfWeekColumn" below should probably replace the #"Duplicate_Date_Column"
    // and #"Start_of_Week" steps of your Triage_Turnbacks query.
    // You could also assign rec[replacer] to a variable,
    // just to make the code/call site easier to read.
    startOfWeekColumn = Table.AddColumn(triageTurnbacks, "startOfWeek", each rec[replacer]([Section A Turnbacks]), type any),
    datesToJoin = Table.FromColumns({rec[values]}, {"startOfWeek"}),
    joined = Table.NestedJoin(datesToJoin, "startOfWeek", startOfWeekColumn, "startOfWeek", "Total Turnbacks", JoinKind.LeftOuter),
    reduced = Table.TransformColumns(joined, {{"Total Turnbacks", Table.RowCount, type number}}),

    // Helper functions. I don't think cumulative summation can be achieved natively (as far as I know).
    ListCumulativeSum = (listOfNumbers as list) as list => List.Accumulate(listOfNumbers, {}, (listState, currentItem) => listState & {List.Last(listState, 0) + currentItem}),
    TableCumulativeSum = (someTable as table, columnToSum as text, newColumnName as text) as table =>
        let
            runningTotal = ListCumulativeSum(Table.Column(someTable, columnToSum)),
            rightTable = Table.FromColumns({{0..List.Count(runningTotal)-1}, runningTotal}, {"$rightJoinKey", newColumnName}),
            leftTable = Table.AddIndexColumn(someTable, "$leftJoinKey"),
            joined = Table.Join(leftTable, {"$leftJoinKey"}, rightTable, {"$rightJoinKey"}, JoinKind.FullOuter),
            dropped = Table.RemoveColumns(joined, {"$leftJoinKey", "$rightJoinKey"}),
            changedTypes = Table.TransformColumnTypes(dropped, {{newColumnName, type number}})
        in changedTypes,
    runningTotal = TableCumulativeSum(reduced, "Total Turnbacks", "Running Total")
in
    runningTotal

这给了我这个:

查询输出

我认为这与您的预期输出相匹配,但Total Turnbacks开始的一周的列除外22/07/2019(您拥有1和我拥有的列0)。

您显然可以"Prior period"从最终表格中过滤/删除该行。有关详细信息,请参阅代码中的注释。如果他们没有帮助,请告诉我。


推荐阅读