License: this content comes from StackOverflow and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must keep the same license and attribute it to the original authors (not me). Original StackOverflow question: http://stackoverflow.com/questions/30752268/

Time: 2020-08-19 08:54:10  Source: igfitidea

How to filter objects for count annotation in Django?

python django django-models django-aggregation

Asked by rudyryk

Consider the simple Django models Event and Participant:

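from django.db import models

class Event(models.Model):
    title = models.CharField(max_length=100)

class Participant(models.Model):
    # on_delete is a required argument since Django 2.0
    event = models.ForeignKey(Event, on_delete=models.CASCADE, db_index=True)
    is_paid = models.BooleanField(default=False, db_index=True)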

It's easy to annotate the events query with the total number of participants:

events = Event.objects.all().annotate(participants=models.Count('participant'))

How can I annotate with the count of participants filtered by is_paid=True?

I need to query all events regardless of the number of participants, i.e. I don't need to filter by the annotated result. If there are 0 participants, that's OK, I just need 0 in the annotated value.

The example from the documentation doesn't work here, because it excludes objects from the query instead of annotating them with 0.
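To see why a plain filter is not enough, here is a minimal sketch that uses Python's sqlite3 module instead of the ORM. The table names, column names and sample rows are assumptions made for illustration, and the query is hand-written rather than the exact SQL Django would emit. Filtering on the joined rows drops every event that has no paid participant:

```python
import sqlite3

# Hypothetical schema and data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE event (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE participant (
        id INTEGER PRIMARY KEY, event_id INTEGER, is_paid INTEGER);
    INSERT INTO event VALUES (1, 'paid'), (2, 'unpaid'), (3, 'empty');
    INSERT INTO participant (event_id, is_paid) VALUES (1, 1), (1, 0), (2, 0);
""")

# The WHERE clause removes every joined row that is not paid, so the
# 'unpaid' and 'empty' events vanish instead of being reported with 0.
filtered = conn.execute("""
    SELECT e.title, COUNT(p.id)
    FROM event e JOIN participant p ON p.event_id = e.id
    WHERE p.is_paid = 1
    GROUP BY e.id ORDER BY e.id
""").fetchall()
print(filtered)
```

Only the event with a paid participant survives, which is exactly the behaviour the question wants to avoid.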

Update. Django 1.8 has a new conditional expressions feature, so now we can do it like this:

events = Event.objects.all().annotate(paid_participants=models.Sum(
    models.Case(
        models.When(participant__is_paid=True, then=1),
        default=0,
        output_field=models.IntegerField()
    )))
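At the SQL level, this Sum/Case/When annotation corresponds roughly to a SUM(CASE ...) over a LEFT JOIN. The sketch below reproduces that idea with sqlite3. The schema and data are assumed for illustration, and the query is a hand-written approximation, not the SQL Django generates.

```python
import sqlite3

# Hypothetical schema and data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE event (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE participant (
        id INTEGER PRIMARY KEY, event_id INTEGER, is_paid INTEGER);
    INSERT INTO event VALUES (1, 'paid'), (2, 'unpaid'), (3, 'empty');
    INSERT INTO participant (event_id, is_paid) VALUES (1, 1), (1, 0), (2, 0);
""")

# LEFT JOIN keeps every event; the CASE maps paid rows to 1 and all other
# rows (unpaid, or the NULL row of an event with no participants) to 0.
rows = conn.execute("""
    SELECT e.title,
           SUM(CASE WHEN p.is_paid = 1 THEN 1 ELSE 0 END) AS paid_participants
    FROM event e LEFT JOIN participant p ON p.event_id = e.id
    GROUP BY e.id ORDER BY e.id
""").fetchall()
print(rows)
```

Every event appears in the result, and events without paid participants get 0 rather than being dropped.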

Update 2. Django 2.0 has a new conditional aggregation feature; see the accepted answer below.

Accepted answer by Oli

Conditional aggregation in Django 2.0 allows you to further reduce the amount of faff this has been in the past. This will also use Postgres' filter logic, which is somewhat faster than a sum-case (I've seen numbers like 20-30% bandied around).

Anyway, in your case, we're looking at something as simple as:

from django.db.models import Q, Count

# The reverse lookup name is 'participant', since the question's ForeignKey sets no related_name
events = Event.objects.annotate(
    paid_participants=Count('participant', filter=Q(participant__is_paid=True))
)
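As a rough illustration of what a filtered count means in SQL, the sketch below uses sqlite3 with an assumed schema and made-up rows. It writes the count portably as COUNT over a CASE expression and, where the bundled SQLite is new enough (3.30 or later), checks that the FILTER clause gives the same answer. Both queries are hand-written approximations, not Django's actual output.

```python
import sqlite3

# Hypothetical schema and data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE event (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE participant (
        id INTEGER PRIMARY KEY, event_id INTEGER, is_paid INTEGER);
    INSERT INTO event VALUES (1, 'busy'), (2, 'quiet');
    INSERT INTO participant (event_id, is_paid) VALUES (1, 1), (1, 1), (1, 0);
""")

# COUNT ignores NULLs, and the CASE yields NULL for unpaid or missing rows,
# so only paid participants are counted while every event is kept.
portable = conn.execute("""
    SELECT e.title, COUNT(CASE WHEN p.is_paid = 1 THEN p.id END)
    FROM event e LEFT JOIN participant p ON p.event_id = e.id
    GROUP BY e.id ORDER BY e.id
""").fetchall()
print(portable)

# FILTER on aggregates needs SQLite 3.30 or newer, so it is guarded here.
if sqlite3.sqlite_version_info >= (3, 30, 0):
    with_filter = conn.execute("""
        SELECT e.title, COUNT(p.id) FILTER (WHERE p.is_paid = 1)
        FROM event e LEFT JOIN participant p ON p.event_id = e.id
        GROUP BY e.id ORDER BY e.id
    """).fetchall()
    print(with_filter == portable)
```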

There's a separate section in the docs about filtering on annotations. It's the same stuff as conditional aggregation but more like my example above. Either way, this is a lot healthier than the gnarly subqueries I was doing before.

Answer by Todor

UPDATE

The sub-query approach which I mention is now supported in Django 1.11 via subquery-expressions.

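from django.db import models
from django.db.models import Count, OuterRef, Subquery

# Note: events with no paid participants get None (SQL NULL) here, not 0
Event.objects.annotate(
    num_paid_participants=Subquery(
        Participant.objects.filter(
            is_paid=True,
            event=OuterRef('pk')
        ).values('event')
        .annotate(cnt=Count('pk'))
        .values('cnt'),
        output_field=models.IntegerField()
    )
)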
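One thing to watch with a grouped subquery like this is what happens for an event with no matching rows: the subquery returns nothing, so the value is NULL rather than 0. The sqlite3 sketch below shows that behaviour and a COALESCE fix, using an assumed schema and made-up data. In the ORM, the Coalesce function from django.db.models.functions plays the same role, though the SQL here is hand-written and only approximates what Django produces.

```python
import sqlite3

# Hypothetical schema and data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE event (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE participant (
        id INTEGER PRIMARY KEY, event_id INTEGER, is_paid INTEGER);
    INSERT INTO event VALUES (1, 'paid'), (2, 'unpaid'), (3, 'empty');
    INSERT INTO participant (event_id, is_paid) VALUES (1, 1), (1, 0), (2, 0);
""")

# raw_count: the grouped, correlated subquery yields no row (NULL) when an
# event has no paid participants. safe_count wraps it in COALESCE to get 0.
rows = conn.execute("""
    SELECT e.title,
           (SELECT COUNT(*) FROM participant p
            WHERE p.is_paid = 1 AND p.event_id = e.id
            GROUP BY p.event_id) AS raw_count,
           COALESCE((SELECT COUNT(*) FROM participant p
                     WHERE p.is_paid = 1 AND p.event_id = e.id
                     GROUP BY p.event_id), 0) AS safe_count
    FROM event e ORDER BY e.id
""").fetchall()
print(rows)
```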

I prefer this over aggregation (sum+case), because it should be faster and easier to optimize (with proper indexing).

For older versions, the same can be achieved using .extra:

Event.objects.extra(select={'num_paid_participants': "\
    SELECT COUNT(*) \
    FROM `myapp_participant` \
    WHERE `myapp_participant`.`is_paid` = 1 AND \
            `myapp_participant`.`event_id` = `myapp_event`.`id`"
})

Answer by rudyryk

Just discovered that Django 1.8 has a new conditional expressions feature, so now we can do it like this:

events = Event.objects.all().annotate(paid_participants=models.Sum(
    models.Case(
        models.When(participant__is_paid=True, then=1),
        default=0, output_field=models.IntegerField()
    )))

Answer by Raffi

I would suggest using the .values method of your Participant queryset instead.

In short, what you want to do is given by:

Participant.objects\
    .filter(is_paid=True)\
    .values('event')\
    .distinct()\
    .annotate(models.Count('id'))

A complete example is as follows:

  1. Create 2 Events:

    event1 = Event.objects.create(title='event1')
    event2 = Event.objects.create(title='event2')
    
  2. Add Participants to them:

    part1l = [Participant.objects.create(event=event1, is_paid=(i % 2 == 0))
              for i in range(10)]
    part2l = [Participant.objects.create(event=event2, is_paid=(i % 2 == 0))
              for i in range(50)]
    
  3. Group all Participants by their event field:

    Participant.objects.values('event')
    > <QuerySet [{'event': 1}, {'event': 1}, {'event': 1}, {'event': 1}, {'event': 1}, {'event': 1}, {'event': 1}, {'event': 1}, {'event': 1}, {'event': 1}, {'event': 2}, {'event': 2}, {'event': 2}, {'event': 2}, {'event': 2}, {'event': 2}, {'event': 2}, {'event': 2}, {'event': 2}, {'event': 2}, '...(remaining elements truncated)...']>
    

    Here distinct is needed:

    Participant.objects.values('event').distinct()
    > <QuerySet [{'event': 1}, {'event': 2}]>
    

    What .values and .distinct are doing here is creating two buckets of Participants grouped by their event. Note that those buckets contain Participants.

  4. You can then annotate those buckets, as they contain the set of original Participants. Here we want to count the number of Participants; this is done simply by counting the ids of the elements in those buckets (since those are Participants):

    Participant.objects\
        .values('event')\
        .distinct()\
        .annotate(models.Count('id'))
    > <QuerySet [{'event': 1, 'id__count': 10}, {'event': 2, 'id__count': 50}]>
    
  5. Finally, you want only Participants with is_paid being True. You can just add a filter in front of the previous expression, which yields the expression shown above:

    Participant.objects\
        .filter(is_paid=True)\
        .values('event')\
        .distinct()\
        .annotate(models.Count('id'))
    > <QuerySet [{'event': 1, 'id__count': 5}, {'event': 2, 'id__count': 25}]>
    
The only drawback is that you have to retrieve the Event afterwards, as you only have the id from the method above.
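A grouped query on Participant also returns nothing at all for events without paid participants, so the zero has to be filled in when the counts are matched back to events. The following is a minimal sketch of that merge step in plain Python. The rows are copied from the output of step 5, while the list of event ids (including a third event with no paid participants) is a hypothetical stand-in for whatever a separate Event query would return.

```python
# Rows shaped like the output of step 5.
paid_counts = [{'event': 1, 'id__count': 5}, {'event': 2, 'id__count': 25}]

# Hypothetical ids of all events; event 3 has no paid participants.
all_event_ids = [1, 2, 3]

# Index the counts by event id, then default to 0 for missing events.
count_by_event = {row['event']: row['id__count'] for row in paid_counts}
result = {event_id: count_by_event.get(event_id, 0) for event_id in all_event_ids}
print(result)
```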

Answer by Arindam Roychowdhury

The result I am looking for:

  • People (assignees) who have tasks added to a report: the total unique count of people.
  • People who have tasks added to a report, counting only tasks whose billability is more than 0.

In general, I would have to use two different queries:

Task.objects.filter(billable_efforts__gt=0)
Task.objects.all()

But I want both in one query. Hence:

Task.objects.values('report__title').annotate(withMoreThanZero=Count('assignee', distinct=True, filter=Q(billable_efforts__gt=0))).annotate(totalUniqueAssignee=Count('assignee', distinct=True))

Result:

<QuerySet [{'report__title': 'TestReport', 'withMoreThanZero': 37, 'totalUniqueAssignee': 50}, {'report__title': 'Utilization_Report_April_2019', 'withMoreThanZero': 37, 'totalUniqueAssignee': 50}]>
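To make the two annotations concrete, here is a small plain-Python model of what a distinct count with and without a filter computes per group. The task rows are invented for illustration and do not come from the answer's database: for each report it counts the distinct assignees with at least one billable task, alongside all distinct assignees.

```python
from collections import defaultdict

# Hypothetical task rows: (report title, assignee, billable efforts).
tasks = [
    ("R1", "ann", 3), ("R1", "ann", 0), ("R1", "bob", 0),
    ("R2", "cat", 2), ("R2", "dan", 5),
]

billable = defaultdict(set)  # assignees with at least one billable task
everyone = defaultdict(set)  # all assignees, regardless of billability

for report, assignee, efforts in tasks:
    everyone[report].add(assignee)
    if efforts > 0:
        billable[report].add(assignee)

# (filtered distinct count, total distinct count) per report
summary = {r: (len(billable[r]), len(everyone[r])) for r in everyone}
print(summary)
```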