LINQ query with a check to find supersets

Problem description

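I want to query a list that contains some repeated text, where entries may or may not have an Id value.

I have two conditions to match:

  1. When two entries share the same [Text, Type] pair, select the one that has an Id; otherwise select the single entry.
  2. When one of two texts contains the other, select the superset text. [Singing, Dancing Singing => Dancing Singing]

Type      Text            Id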

Name      John  
Name      John            22
Name      John Smith      2548
Hobby     Singing         
Hobby     Dancing Singing
School    XYZ             
School    XYZ             242

Expected output:

Type      Text            Id

Name      John Smith      2548
Hobby     Dancing Singing
School    XYZ             242

Tags: c#, linq

Solution

This is ugly, but it works:

using System;
using System.Collections.Generic;
using System.Linq;

class Program
{
    static void Main(string[] args)
    {
        
        List<Record> records = BuildTestData();
        List<Record> deduped = DeDupe(records);
        Console.Clear();

        foreach (Record r in deduped)
            Console.WriteLine($"Type:{r.Typ}, Text:{r.Txt}, ID:{r.ID} ");

        Console.ReadKey();
    }


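    // Returns the records that survive both rules. Record is a class, so == and != compare references.
    static List<Record> DeDupe(List<Record> dupes)
    {
        List<Record> excludes = new List<Record>();
        // Rule 1: within each duplicated (Typ, Txt) group, exclude the entries that have no ID.
        excludes.AddRange(dupes.GroupBy(x => new { x.Typ, x.Txt }).Where(y => y.Count() > 1).SelectMany(z => z.Where(a => string.IsNullOrEmpty(a.ID))));
        // Rule 2: exclude any remaining record whose text is contained in another record's longer text.
        excludes.AddRange(dupes.Where(x => !excludes.Any(y => x == y) && dupes.Any(z => x != z && x.Txt != z.Txt && z.Txt.Contains(x.Txt))));
        // Keep everything that was not excluded.
        return dupes.Where(x => !excludes.Any(y => x == y)).ToList();
    }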


    static List<Record> BuildTestData()
    {
        return new List<Record>
        {
            new Record { Typ = "Name", Txt = "John", ID = null},
            new Record { Typ = "Name", Txt = "John", ID = "22"},
            new Record { Typ = "Name", Txt = "John Smith", ID = "2548"},
            new Record { Typ = "Hobby", Txt = "Singing", ID = null},
            new Record { Typ = "Hobby", Txt = "Dancing Singing", ID = null},
            new Record { Typ = "School", Txt = "XYZ", ID = null},
            new Record { Typ = "School", Txt = "XYZ", ID = "242"},
        };
    }
}

public class Record
{ 
    public string Typ { get; set; }
    public string Txt { get; set; }
    public string ID { get; set; }
}
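
The two rules can also be written as one grouped query. The following is only a sketch under assumptions that the question does not confirm: the superset check is limited to records of the same Typ, exactly one record is kept for each distinct text (preferring one that has an ID), and Txt is never null. RecordFilter and KeepSupersets are hypothetical names introduced for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Record
{
    public string Typ { get; set; }
    public string Txt { get; set; }
    public string ID { get; set; }
}

// Hypothetical helper, not part of the original answer.
public static class RecordFilter
{
    public static List<Record> KeepSupersets(IEnumerable<Record> records)
    {
        return records
            // Assumption: texts are only compared within the same Typ.
            .GroupBy(r => r.Typ)
            .SelectMany(typeGroup =>
            {
                // Rule 1: one record per distinct text, preferring an entry with an ID.
                var distinct = typeGroup
                    .GroupBy(r => r.Txt)
                    .Select(g => g.FirstOrDefault(r => !string.IsNullOrEmpty(r.ID)) ?? g.First())
                    .ToList();

                // Rule 2: drop a text when a different text of the same Typ contains it.
                return distinct.Where(r =>
                    !distinct.Any(o => o.Txt != r.Txt && o.Txt.Contains(r.Txt)));
            })
            .ToList();
    }
}
```

Calling RecordFilter.KeepSupersets(records) on the test data above should yield the three rows from the expected output. As in the original answer, Contains is a plain substring test, so "John" would also count as contained in "Johnson"; splitting the text into words would be needed if that is not wanted.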
