首页 > 解决方案 > DAX 在字符串中搜索多个值

问题描述

我需要创建一个新的 DAX 列,该列将从同一个表中的另一列中搜索字符串。它将搜索第二个表中的任何值,如果找到任何这些值,则返回 True。简化示例:

假设我有一个名为 Sentences 的表,其中包含 1 列:

Sentences
Col1
----------------
"The aardvark admitted it was wrong"
"The attractive peanut farmer graded the term paper"
"The awning was too tall to touch"

还有另一个名为 FindTheseWords 的表,其中包含值列表

FindTheseWords 
Col1
----------------
peanut
aardvark

我将在 Sentences 表中创建 Col2,它应该返回

Sentences
Col1                                                    Col2
----------------------------------------------------    ------------------------
"The aardvark admitted it was wrong"                    TRUE
"The attractive peanut farmer graded the term paper"    TRUE
"The awning was too tall to touch"                      FALSE

FindTheseWords 的列表实际上很长,所以我不能只对它们进行硬编码并使用 OR。我需要参考表格。我不关心空格,所以带有“花生”的句子也会返回真。

我在 M 中看到了一个很好的实现,但是我的负载性能受到了很好的影响,所以我希望为新列找到一个 DAX 选项。M解决方案,供参考:如何在一个字符串中搜索多个字符串?

标签: exceldaxpowerpivot

解决方案


事实表

| Column1                                              |
|------------------------------------------------------|
| The aardvark admitted it was   wrong                 |
| The attractive peanut farmer   graded the term paper |
| The awning was too tall to   touch                   |
| This is  text string                                 |
| Tester is needed                                     |

句子表

| Column1    |
|------------|
| attractive |
| peanut     |
| aardvark   |
| Tester     |

计算列

Column =
VAR _1 =
    ADDCOLUMNS ( 'fact', "newColumn", SUBSTITUTE ( 'fact'[Column1], " ", "|" ) )
VAR _2 =
    GENERATE (
        _1,
        ADDCOLUMNS (
            GENERATESERIES ( 1, PATHLENGTH ( [newColumn] ) ),
            "Words", PATHITEM ( [newColumn], [Value], TEXT )
        )
    )
VAR _3 =
    ADDCOLUMNS (
        _2,
        "test", CONTAINS ( VALUES ( sentence[Column1] ), sentence[Column1], [Words] )
    )
VAR _4 =
    DISTINCT (
        SELECTCOLUMNS (
            FILTER ( _3, [test] = TRUE ),
            "Column1", [Column1] & "",
            "test", [test] & ""
        )
    )
VAR _5 =
    DISTINCT (
        SELECTCOLUMNS (
            FILTER ( _3, [test] = FALSE ),
            "Column1", [Column1] & "",
            "test", [test] & ""
        )
    )
VAR _7 =
    FILTER ( _5, [Column1] = MAXX ( _4, [Column1] ) )
VAR _8 =
    UNION ( _4, _7 )
RETURN
    MAXX (
        FILTER ( _8, [Column1] = CALCULATE ( MAX ( 'fact'[Column1] ) ) ),
        [test]
    )

解决方案


推荐阅读