Reusing a subquery for ordering in the Django ORM

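Problem description

I run a dog salon, and dogs rarely get haircuts. To encourage owners to come back, I want to send them vouchers for their next visit. Eligibility depends on whether a dog had a haircut between 2 months and 2 years ago. If the last haircut was more than 2 years ago, we can assume the customer is lost; if it was less than 2 months ago, it is too close to the previous haircut. We will target the owners who visited most recently first.

My underlying database is PostgreSQL.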

import uuid
from datetime import timedelta
from django.db import models
from django.db.models import Max, OuterRef, Subquery
from django.utils import timezone


# Dogs have one owner, owners can have many dogs, dogs can have many haircuts

class Owner(models.Model):
    id = models.UUIDField(primary_key=True, default=uuid.uuid4, editable=False)
    name = models.CharField(max_length=255)


class Dog(models.Model):
    id = models.UUIDField(primary_key=True, default=uuid.uuid4, editable=False)
    owner = models.ForeignKey(Owner, on_delete=models.CASCADE, related_name="dogs")
    name = models.CharField(max_length=255)


class Haircut(models.Model):
    id = models.UUIDField(primary_key=True, default=uuid.uuid4, editable=False)
    dog = models.ForeignKey(Dog, on_delete=models.CASCADE, related_name="haircuts")
    at = models.DateField()


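today = timezone.now().date()
# timedelta has no "years" or "months" arguments, so approximate them in days
start = today - timedelta(days=2 * 365)  # roughly 2 years ago
end = today - timedelta(days=60)  # roughly 2 months ago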
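
Because datetime.timedelta only accepts fixed-length units (days, seconds, weeks and so on), a calendar-accurate window needs a little extra work. The sketch below is one possible approach using only the standard library; the helper name months_ago and the fixed example date are assumptions made for illustration and are not part of the original question.

```python
import calendar
from datetime import date


def months_ago(day: date, months: int) -> date:
    """Return the date `months` calendar months before `day`.

    The day of month is clamped to the last valid day of the target
    month, so 31 March minus one month gives the last day of February.
    """
    # Count months from year 0 so the subtraction can cross year boundaries.
    total = day.year * 12 + (day.month - 1) - months
    year, month_index = divmod(total, 12)
    month = month_index + 1
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(day.day, last_day))


today = date(2024, 3, 31)      # fixed date so the example is reproducible
start = months_ago(today, 24)  # two years ago
end = months_ago(today, 2)     # two months ago
```

In the real project, today would come from timezone.now().date() as in the question; the resulting start and end can be passed to at__range unchanged.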

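It strikes me that the problem can be broken down into two queries. The first aggregates, per owner, the most recent haircut any of their dogs had between 2 months and 2 years ago.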

dog_aggregate = Haircut.objects.annotate(Max("at")).filter(at__range=(start, end))

The result is then joined to the owner table.

owners_by_shaggiest_dog_1 = Owner.objects # what's the rest of this?

This should result in SQL similar to:

select
  owner.id,
  owner.name
from
  (
    select
      dog.owner_id,
      max(haircut.at) last_haircut
    from haircut
      left join dog on haircut.dog_id = dog.id
    where
      haircut.at
        between current_date - interval '2' year
            and current_date - interval '2' month
    group by
      dog.owner_id
  ) dog_aggregate
  left join owner on dog_aggregate.owner_id = owner.id
order by
  dog_aggregate.last_haircut asc,
  owner.name;

After some experimenting, I managed to get the correct results:

haircut_annotation = Subquery(
    Haircut.objects
    .filter(dog__owner=OuterRef("pk"), at__range=(start, end))
    .order_by("-at")
    .values("at")[:1]
)

owners_by_shaggiest_dog_2 = (
    Owner.objects
    .annotate(last_haircut=haircut_annotation)
    .order_by("-last_haircut", "name")
)

However, the generated SQL seems inefficient, because a new subquery is executed for every row:

select
  owner.id,
  owner.name,
  (
    select
      haircut.at
    from haircut
      inner join dog on haircut.dog_id = dog.id
    where haircut.at
            between current_date - interval '2' year
                and current_date - interval '2' month
      and dog.owner_id = (owner.id)
    order by
      haircut.at desc
    limit 1
  ) last_haircut
from
  owner
order by
  last_haircut asc,
  owner.name;

P.S. I don't actually run a dog salon, so I can't give you a voucher. Sorry!

Tags: django, postgresql, django-models

Solution


If I understand correctly, you can write a query like this:

from django.db.models import Max

Owner.objects.filter(
    dogs__haircuts__at__range=(start, end)
).annotate(
    last_haircut=Max('dogs__haircuts__at')
).order_by('last_haircut', 'name')

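The last haircut is the Max one, since timestamps grow larger as time passes.

Note, however, that neither your query nor this one excludes owners whose dogs had a haircut more recently than the end of the window. Those haircuts are simply not taken into account when computing last_haircut.

If you want to exclude such owners, build the query like this: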

from django.db.models import Max

Owner.objects.exclude(
    dogs__haircuts__at__gt=end
).filter(
    dogs__haircuts__at__range=(start, end)
).annotate(
    last_haircut=Max('dogs__haircuts__at')
).order_by('last_haircut', 'name')
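
To make the difference between the two querysets concrete, here is a plain-Python sketch of what each one is expected to compute. It is not Django code: the owner names, dates and the rank_owners helper are hypothetical, and the data is flattened to one list of haircut dates per owner.

```python
from datetime import date

# Hypothetical data: owner name -> haircut dates across all of that owner's dogs.
haircuts_by_owner = {
    "Alice": [date(2023, 6, 1), date(2024, 3, 20)],  # also has a haircut after `end`
    "Bob": [date(2023, 1, 10), date(2023, 9, 5)],
    "Carol": [date(2020, 5, 5)],                     # only older than `start`
    "Dave": [date(2023, 1, 10)],
}

start = date(2022, 4, 1)
end = date(2024, 2, 1)


def rank_owners(data, start, end, exclude_recent):
    """Mimic filter(range) + Max + order_by in plain Python."""
    rows = []
    for name, dates in data.items():
        if exclude_recent and any(d > end for d in dates):
            continue  # mirrors .exclude(dogs__haircuts__at__gt=end)
        in_window = [d for d in dates if start <= d <= end]
        if not in_window:
            continue  # mirrors .filter(dogs__haircuts__at__range=(start, end))
        rows.append((max(in_window), name))  # mirrors Max('dogs__haircuts__at')
    rows.sort()  # mirrors .order_by('last_haircut', 'name')
    return [name for _, name in rows]


without_exclude = rank_owners(haircuts_by_owner, start, end, exclude_recent=False)
with_exclude = rank_owners(haircuts_by_owner, start, end, exclude_recent=True)
```

With this sample data, Alice appears in the first result but not in the second, because one of her dogs had a haircut after the end of the window.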
