首页 > 解决方案 > Django 聚合查询包括零计数

问题描述

在我的 Django 应用程序中,我试图获取所有Student提交的Paper的计数,包括提交 NO 论文的学生(表示为 count=0)。

模型.py

class Student(models.Model):
   idstudent = models.AutoField(primary_key=True)
   student_name = models.CharField(max_length=250, null=False, blank=False, verbose_name='Student Name')

class Paper(models.Model):
   idpaper = models.AutoField(primary_key=True)
   student = models.ForeignKey(Student, on_delete=models.PROTECT, null=False, blank=False)

查询尝试 1:仅返回已提交论文的学生

papers = Paper.objects.order_by('submission_date')
result = papers.values('student', student_name=F('student__student_name')).annotate(count=Count('student')).distinct().order_by('-count')
print(result)       

<QuerySet [{'idstudent': 1, 'student_name': '\nMichael Jordan\n', 'count': 4}, {'idstudent': 2, 'student_name': '\nSteve White\n', 'count': 2}, {'idstudent': 3, 'student_name': '\nHillary Clinton\n', 'count': 1}]>

查询尝试 2:返回提交 0 篇论文的学生,但其他学生的计数为 1

result = Student.objects.values('pk', student_name=F('student_name'))
    .annotate(
        count=Count(
            'pk',
            filter=Q(pk__in=Paper.objects.values('student')
            )
        )
    )
).order_by('-count')
print(result)

<QuerySet [{'idstudent': 1, 'student_name': '\nMichael Jordan\n', 'count': 1}, {'idstudent': 2, 'student_name': '\nSteve White\n', 'count': 1}, {'idstudent': 3, 'student_name': '\nHillary Clinton\n', 'count': 1}, , {'idstudent': 4, 'student_name': '\nDoug Funny\n', 'count': 0}, , {'idstudent': 5, 'student_name': '\nSkeeter Valentine\n', 'count': 0}]>

Attempt 2相同,我还尝试了以下使用Sum(Case(,它产生了相同的结果,因为我认识到Attempt 2原始 SQL 实际上使用Case(When了 ,但似乎计算“列表”Student.pk中出现的Paper.objects.values时间(而不考虑它出现了多少次)。

result = Student.objects.values('pk', student_name=F('student_name')).annotate(
    count=Sum(
        Case(
            When(pk__in=Paper.objects.values('student'), then=1),
            default=0, output_field=IntegerField()
        )
    )
)

<QuerySet [{'idstudent': 1, 'student_name': '\nMichael Jordan\n', 'count': 1}, {'idstudent': 2, 'student_name': '\nSteve White\n', 'count': 1}, {'idstudent': 3, 'student_name': '\nHillary Clinton\n', 'count': 1}, , {'idstudent': 4, 'student_name': '\nDoug Funny\n', 'count': 0}, , {'idstudent': 5, 'student_name': '\nSkeeter Valentine\n', 'count': 0}]>

我如何调整我的查询以包括提交 0 篇论文的学生,同时保持正确的学生数量?

标签: djangodjango-viewsdjango-querysetdjango-q

解决方案


与尝试 2 相同,我还尝试了以下使用 Sum(Case( 产生了相同的结果,因为我认识到尝试 2 原始 SQL 实际上使用了 Case(When,但似乎只在 Student.pk 存在时才算在 Paper.objects.values “列表”中(但不考虑它存在的次数)。

要么我不理解问题/问题,但是您的尝试 2 示例将计数过滤为 only Paper.objects.values "list",这样做是正常的吗?

您是否尝试过简单的:

Student.objects.annotate(num_papers=Count('paper'))

如果您想对计数进行额外的过滤,我的建议是使用子查询,这是一个示例:

Student.objects.annotate(
    num_papers=Subquery(
        Paper.objects.filter(student=OuterRef('pk'))
            # The first .values call defines our GROUP BY clause
            # Its important to have a filtration on every field defined here
            # Otherwise you will have more than one row per group!
            # In this example we group only by student
            # and we already filtered by student.
            # any extra filtration you want should be make here too (before the grouping).
            .values('student')
            # Here we say: count how many rows we have per group 
            .annotate(cnt=Count('pk'))
            # Here we say: return only the count
            .values('cnt')
    )
)

推荐阅读