Using an annotation value in a subsequent annotation raises FieldError

Problem description

What I am trying to do:

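I have two models, Topic and Entry. Entry has a foreign key to Topic. I need to list the topics in which the user has entries created in the last 24 hours. I also need to annotate each topic with a count: the total number of entries created after the last entry the user wrote in that topic. (To picture it, think of an inbox that lists conversations along with the number of unread messages in each.)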

This is what I came up with:

relevant_topics = (
    Entry.objects.filter(author=user, date_created__gte=time_threshold(hours=24))
    .values_list("topic__pk", flat=True)
    .order_by()
    .distinct()
)

qs = (
    Topic.objects.filter(pk__in=relevant_topics).annotate(
        latest=Max("entries__date_created", filter=Q(entries__author=user)),
        count=Count("entries", filter=Q(date_created__gte=F("latest__date_created"))),
    )
).values("title", "count")

Which throws:

FieldError: Cannot resolve keyword 'date_created' into field. Join on 'latest' not permitted.

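I really don't know whether Django itself does not support what I wrote, or whether my solution is at fault. I thought about adding the count with .extra(), but I could not work out how to use the latest annotation there. I would really appreciate any query that produces the expected output.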

Reference dataset:

(assume the current user = Jack)

<User username: Jack>
<User username: John>

<Topic title: foo>
<Topic title: bar>
<Topic title: baz>

(Assume higher pk = created later.)

<Entry pk:1 topic:foo user:Jack>
<Entry pk:2 topic:foo user:Jack> (date_created in last 24 hours)
<Entry pk:3 topic:foo user:John> (date_created in last 24 hours)

<Entry pk:4 topic:bar user:Jack> (date_created in last 24 hours)

<Entry pk:5 topic:baz user:John> (date_created in last 24 hours)

Given the dataset, the output should only be:

<Topic:foo count:1>
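
To make the counting rule concrete, here is a minimal plain-Python sketch of it applied to the reference dataset. It is not an ORM solution; the tuple layout, the `recent` flag (standing in for "date_created in last 24 hours"), and the function name `unread_counts` are assumptions made for illustration, and pk order stands in for creation order as stated above.

```python
from collections import defaultdict

# Reference dataset: (pk, topic, author, created_in_last_24h).
# A higher pk means the entry was created later (assumption from the dataset notes).
ENTRIES = [
    (1, "foo", "Jack", False),
    (2, "foo", "Jack", True),
    (3, "foo", "John", True),
    (4, "bar", "Jack", True),
    (5, "baz", "John", True),
]


def unread_counts(entries, user):
    """Return {topic: count} of entries newer than the user's latest recent entry."""
    # Latest pk per topic among the user's entries from the last 24 hours.
    latest = {}
    for pk, topic, author, recent in entries:
        if author == user and recent:
            latest[topic] = max(pk, latest.get(topic, pk))

    # Count entries in those topics that are strictly newer than that pk.
    counts = defaultdict(int)
    for pk, topic, _author, _recent in entries:
        if topic in latest and pk > latest[topic]:
            counts[topic] += 1
    return dict(counts)


print(unread_counts(ENTRIES, "Jack"))  # {'foo': 1}
```

Topic bar drops out because Jack's own entry is the newest one there, and baz drops out because Jack has no recent entry in it.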

Edit:

To give you an idea, here is a raw SQL solution that produces the correct output:

    pk = user.pk
    threshold = time_threshold(hours=24)

    with connection.cursor() as cursor:
        cursor.execute(
            """
        select
          s.title,
          s.slug,
          s.count
        from
          (
            select
              tt.title,
              tt.slug,
              e.count,
              e.max_id
            from
              (
                select
                  z.topic_id,
                  count(
                    case when z.id > k.max_id then z.id end
                  ) as count,
                  k.max_id
                from
                  dictionary_entry z
                  inner join (
                    select
                      topic_id,
                      max(de.id) as max_id
                    from
                      dictionary_entry de
                    where
                      de.date_created >= %s
                      and de.author_id = %s
                    group by
                      author_id,
                      topic_id
                  ) k on k.topic_id = z.topic_id
                group by
                  z.topic_id,
                  k.max_id
              ) e
              inner join dictionary_topic tt on tt.id = e.topic_id
          ) s
        where
          s.count > 0
        order by
          s.max_id desc
        """,
            [threshold, pk],
        )
        # convert to dict
        columns = [col[0] for col in cursor.description]
        return [dict(zip(columns, row)) for row in cursor.fetchall()]

Tags: python, sql, django, django-orm

Solution


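This can be achieved with a single SQL query in the database by

  1. filtering the relevant entries (the important point is the OuterRef, which "transfers" the filter to topics),
  2. grouping the entries by topic and counting them, and then
  3. annotating the topics using a Subquery.

Some related information can be found in the Django docs.

For your case, the following should produce the desired result.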

from django.db.models import Count, IntegerField, OuterRef, Subquery

relevant_topics = (
    models.Entry.objects.filter(
        author=user, date_created__gte=time_threshold(24), topic=OuterRef("pk"),
    )
    .order_by()
    .values("topic")
    .annotate(Count("id"))
    .values("id__count")
)

qs = models.Topic.objects.annotate(
    entries_count=Subquery(relevant_topics, output_field=IntegerField())
).filter(entries_count__gt=0)

Hope this helps :-)

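Edit 1:

I think I misunderstood the question and forgot to take into account that the entries to be counted are those written by other authors (after the current author's last entry).

So I came up with the following, which gives the same result as @Paul Rene's answer: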

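from django.db.models import Count, DateTimeField, F, Max, OuterRef, Subquery

# Latest entry date per topic for the current user within the last 24 hours.
latest_in_topic = (
    Entry.objects.filter(author=user, date_created__gte=time_threshold(24), topic=OuterRef("topic"))
    .values("topic")
    .annotate(latest=Max("date_created"))
)

# Count, per topic, the entries created strictly after that date
# (__gt rather than __gte, so the user's own latest entry is not counted).
qs = (
    Entry.objects.annotate(
        latest=Subquery(latest_in_topic.values("latest"), output_field=DateTimeField())
    )
    .filter(date_created__gt=F("latest"))
    .values("topic", "topic__title")
    .annotate(Count("id"))
)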

res = [(t["topic__title"], t["id__count"]) for t in qs]

Edit 2: The ORM produces the following query (obtained with str(qs.query)). Perhaps it offers some clues on how to improve performance.

SELECT "entry"."topic_id", "topic"."title", COUNT("entry"."id") AS "id__count"
FROM "entry"
         INNER JOIN "topic" ON ("entry"."topic_id" = "topic"."id")
WHERE "entry"."date_created" > (SELECT MAX(U0."date_created") AS "latest"
                                    FROM "entry" U0
                                    WHERE (U0."author_id" = 1 AND U0."date_created" >= '2020-04-09 16:31:48.407501+00:00' AND U0."topic_id" = ("entry"."topic_id"))
                                    GROUP BY U0."topic_id")
GROUP BY "entry"."topic_id", "topic"."title";
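
The generated query can be sanity-checked against the reference dataset without Django, using the standard-library sqlite3 module. This is only a sketch under assumptions: the table names are simplified to entry and topic as in the query above, the timestamps are made up so that only entry 1 falls before the threshold, and author ids 1 and 2 stand for Jack and John.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE topic (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE entry (
        id INTEGER PRIMARY KEY,
        topic_id INTEGER REFERENCES topic (id),
        author_id INTEGER,
        date_created TEXT
    );
    INSERT INTO topic VALUES (1, 'foo'), (2, 'bar'), (3, 'baz');
    -- author 1 = Jack, author 2 = John; timestamps are invented for illustration.
    INSERT INTO entry VALUES
        (1, 1, 1, '2020-04-08 10:00:00'),
        (2, 1, 1, '2020-04-10 10:00:00'),
        (3, 1, 2, '2020-04-10 11:00:00'),
        (4, 2, 1, '2020-04-10 12:00:00'),
        (5, 3, 2, '2020-04-10 13:00:00');
    """
)

# Same shape as the ORM-generated query, with placeholders for user and threshold.
QUERY = """
SELECT entry.topic_id, topic.title, COUNT(entry.id) AS id__count
FROM entry
INNER JOIN topic ON entry.topic_id = topic.id
WHERE entry.date_created > (
    SELECT MAX(u0.date_created)
    FROM entry u0
    WHERE u0.author_id = ? AND u0.date_created >= ? AND u0.topic_id = entry.topic_id
    GROUP BY u0.topic_id
)
GROUP BY entry.topic_id, topic.title
"""

rows = conn.execute(QUERY, (1, "2020-04-09 16:31:48")).fetchall()
print(rows)  # [(1, 'foo', 1)]
```

For topics where the user has no recent entry, the correlated subquery yields no row, the comparison evaluates to NULL, and those entries are filtered out, which is why baz does not appear.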
