首页 > 解决方案 > Load filtered hierarchy from self-referencing table in EF Core

问题描述

I have a self-referencing entity Comment in my project:

public class Comment
{
    public Guid CommentId { get; set; }
    
    public string Content { get; set; }
        
    public Guid? ParentCommentId { get; set; }
    public virtual Comment ParentComment { get; set; }
        
    public virtual ICollection<Comment> Children { get; set; }
}

I am trying to do the following:

await context.Comment
    .Where(c => c.ParentCommentId == null)
    .Include(c => c.Children)
    .ToListAsync();

I want to get all root comments (comments that don't have parents) and load the entire hierarchy of comments.

Want do I want to see in the result?

I have the following comments in the database:

To make it more readable, I represent it in hierarchically order (only Content property):

When executing the query, shown above, I want to get something like this (in JSON):

[
   {
      "commentId":"be02742a-9170-4335-afe7-3c7c22684424",
      "content":"Hello World!",
      "parentCommentId":null,
      "children":[
         {
            "commentId":"59656765-d1ed-4648-8696-7d576ab7419f",
            "content":"Are you a programmer?",
            "parentCommentId":"be02742a-9170-4335-afe7-3c7c22684424",
            "children":[
               {
                  "commentId":"0bb77a43-c7bb-482f-9bf8-55c4050974da",
                  "content":"Sure",
                  "parentCommentId":"59656765-d1ed-4648-8696-7d576ab7419f",
                  "children":[
                     
                  ]
               },
               {
                  "commentId":"b8d61cfd-d274-4dae-a2be-72e08cfa9066",
                  "content":"What?",
                  "parentCommentId":"59656765-d1ed-4648-8696-7d576ab7419f",
                  "children":[
                     
                  ]
               }
            ]
         }
      ]
   },
   {
      "commentId":"cfe126b3-4601-4432-8c87-445c1362a225",
      "content":"I wanna go to Mars too!",
      "parentCommentId":null,
      "children":[
         {
            "commentId":"ab6d6b49-d772-48cd-9477-8d40f133c37a",
            "content":"See you on the Moon :)",
            "parentCommentId":"cfe126b3-4601-4432-8c87-445c1362a225",
            "children":[
               
            ]
         }
      ]
   }
]

But what do I get?

When I execute this query I have the following result:

[
   {
      "commentId":"be02742a-9170-4335-afe7-3c7c22684424",
      "content":"Hello World!",
      "postId":"69f3ca3a-66fc-4142-873d-01e950d83adf",
      "post":null,
      "parentCommentId":null,
      "parentComment":null,
      "commentRates":[
         
      ],
      "inverseParentComment":[
         {
            "commentId":"59656765-d1ed-4648-8696-7d576ab7419f",
            "content":"Are you a programmer?",
            "postId":"69f3ca3a-66fc-4142-873d-01e950d83adf",
            "post":null,
            "parentCommentId":"be02742a-9170-4335-afe7-3c7c22684424",
            "commentRates":[
               
            ],
            "inverseParentComment":[
               
            ]
         }
      ]
   },
   {
      "commentId":"cfe126b3-4601-4432-8c87-445c1362a225",
      "content":"I wanna go to Mars too!",
      "postId":"69f3ca3a-66fc-4142-873d-01e950d83adf",
      "post":null,
      "parentCommentId":null,
      "parentComment":null,
      "commentRates":[
         
      ],
      "inverseParentComment":[
         {
            "commentId":"ab6d6b49-d772-48cd-9477-8d40f133c37a",
            "content":"See you on the Moon :)",
            "postId":"69f3ca3a-66fc-4142-873d-01e950d83adf",
            "post":null,
            "parentCommentId":"cfe126b3-4601-4432-8c87-445c1362a225",
            "commentRates":[
               
            ],
            "inverseParentComment":[
               
            ]
         }
      ]
   }
]

So, I get the following hierarchy:

Instead of this:

Why does it work so? How can I fix this problem?

Working, but dirty solution

List<Comment> allComments = await context.Comment
    .Include(c => c.Children)
    .ToListAsync();

List<Comment> filteredComments = allComments.Where(c => c.ParentCommentId == null);

This works, but it's ugly - we load the full hierarchy of comments from the database into the memory and than filter them. It'll work really slow if the database contains, e.g., 1 million comments.

Update

In my case there is no need to use recursive CTE. See update for this question. It looks like the idea introduced by Greg Ogle, but more simplified.

标签: c#entity-framework-corehierarchyself-referencing-table

解决方案


如果您只是从 Comment 表中加载平面数据,则速度相当快。为了协助查找,您可以使用Dictionary<{Guid or whatever type makes sense}, {Comment object type}>. Dictionary 上的查找使用 HashTable(我很确定)并且速度很快。所以,一旦你有了你的根评论,随后的查找就会很快。

下面是用于演示的控制台应用程序代码。它确实需要几秒钟才能运行。也许进一步优化是可能的。

using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;
using System.Timers;

namespace DictionaryHierarchyTest
{
    internal class Program
    {
        private static void Main(string[] args)
        {
            var stopwatch = new Stopwatch();
            stopwatch.Start();

            var commentsFlat = new List<CommentFlat>();
            var parentLookup = new Dictionary<Guid, List<CommentFlat>>();

            for (var i = 0; i < 100000; i++)
            {
                var id = Guid.NewGuid();
                commentsFlat.Add(new CommentFlat
                {
                    Id = id,
                    ParentId = Guid.Empty,
                    Text = $"Some text for ID:{id}"
                });

                for (var j = 0; j < 5; j++)
                {
                    var childId = Guid.NewGuid();
                    commentsFlat.Add(new CommentFlat
                    {
                        Id = childId,
                        ParentId = id,
                        Text = $"Some text for ID:{childId} Parent ID:{id}"
                    });

                    
                    for (var k = 0; k < 5; k++)
                    {
                        var grandChildId = Guid.NewGuid();
                        commentsFlat.Add(new CommentFlat
                        {
                            Id = grandChildId,
                            ParentId = childId,
                            Text = $"Some text for ID:{grandChildId} Parent ID:{childId}"
                        });
                    }

                }
            }

            foreach (var cf in commentsFlat)
            {
                if (!parentLookup.ContainsKey(cf.ParentId))
                {
                    parentLookup.Add(cf.ParentId, new List<CommentFlat>());
                }

                parentLookup[cf.ParentId].Add(cf);
            }
            stopwatch.Stop();
            var ts = stopwatch.Elapsed;
            Console.WriteLine($"There are {commentsFlat.Count} comments");
            Console.WriteLine($"{String.Format("{0:00}:{1:00}:{2:00}.{3:00}", ts.Hours, ts.Minutes, ts.Seconds, ts.Milliseconds / 10)} to randomly populated.");
            stopwatch.Reset();
            stopwatch.Start();

            var commentHierarchy = new List<Comment>();

            foreach (var cf in commentsFlat)
            {
                if (cf.ParentId == Guid.Empty)
                {
                    var comment = new Comment
                    {
                        Id = cf.Id,
                        ParentId = cf.ParentId,
                        Children = BuildHierarchy(cf.Id, parentLookup),
                        Text = cf.Text
                    };
                    commentHierarchy.Add(comment);
                }
            }

            stopwatch.Stop();
            ts = stopwatch.Elapsed;
            Console.WriteLine($"{String.Format("{0:00}:{1:00}:{2:00}.{3:00}", ts.Hours, ts.Minutes, ts.Seconds, ts.Milliseconds / 10)} to make hierarchical");
            //foreach (var c in commentHierarchy) {
            //    PrintHiearchy(c, 1);
            //}
        }
        static string Tabs(int n)
        {
            return new String('\t', n);
        }
        private static void PrintHiearchy(Comment comment, int level) {
            Console.WriteLine($"{Tabs(level)}{comment.Text}");
            foreach(var child in comment.Children){ PrintHiearchy(child, level + 1); }
        }

        private static void OnTimedEvent(Object source, ElapsedEventArgs e)
        {
            Console.WriteLine("The Elapsed event was raised at {0:HH:mm:ss.fff}",
                              e.SignalTime);
        }

        private static List<Comment> BuildHierarchy(Guid parentId, Dictionary<Guid, List<CommentFlat>> flatComments)
        {
            if (flatComments.ContainsKey(parentId))
            {
                return flatComments[parentId].Select(c => new Comment
                {
                    Id = c.Id,
                    ParentId = c.ParentId,
                    Text = c.Text,
                    Children = BuildHierarchy(c.Id, flatComments)
                }).ToList();
            }
            else
            {
                return new List<Comment>();
            }
        }

    }

    public class Comment
    {
        public Guid Id { get; set; }
        public Guid ParentId { get; set; }
        public string Text { get; set; }
        public List<Comment> Children { get; set; }
    }

    public class CommentFlat
    {
        public Guid Id { get; set; }
        public Guid ParentId { get; set; }
        public string Text { get; set; }
    }
}

推荐阅读