How to filter objects for count annotation in Django?
Note: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow
Original URL: http://stackoverflow.com/questions/30752268/
Asked by rudyryk
Consider two simple Django models, Event and Participant:
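from django.db import models

class Event(models.Model):
    title = models.CharField(max_length=100)

class Participant(models.Model):
    # on_delete is required from Django 2.0 onward; CASCADE was the old default
    event = models.ForeignKey(Event, on_delete=models.CASCADE, db_index=True)
    is_paid = models.BooleanField(default=False, db_index=True)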
It's easy to annotate an events query with the total number of participants:
events = Event.objects.all().annotate(participants=models.Count('participant'))
How can I annotate with the count of participants filtered by is_paid=True?
I need to query all events regardless of the number of participants, i.e. I don't need to filter by the annotated result. If there are 0 participants, that's OK; I just need 0 in the annotated value.
The example from the documentation doesn't work here, because it excludes objects from the query instead of annotating them with 0.
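To make the problem concrete, here is a minimal sketch that uses Python's built-in sqlite3 module instead of Django. The table names, column names, and sample rows are assumptions chosen to mirror the models above, and the SQL is only the approximate shape of a join filtered with WHERE, not the exact query Django generates.

```python
import sqlite3

# In-memory database with hypothetical tables mirroring Event and Participant.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE event (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE participant (
        id INTEGER PRIMARY KEY,
        event_id INTEGER REFERENCES event(id),
        is_paid INTEGER NOT NULL DEFAULT 0
    );
    INSERT INTO event (id, title) VALUES
        (1, 'paid event'), (2, 'unpaid event'), (3, 'empty event');
    INSERT INTO participant (event_id, is_paid) VALUES
        (1, 1), (1, 1), (1, 0), (2, 0);
""")

# Filtering the joined rows before grouping drops every event that has
# no paid participant, so events 2 and 3 are missing from the result.
filtered = conn.execute("""
    SELECT e.id, COUNT(p.id)
    FROM event e
    JOIN participant p ON p.event_id = e.id
    WHERE p.is_paid = 1
    GROUP BY e.id
    ORDER BY e.id
""").fetchall()

print(filtered)  # [(1, 2)]
```

Only event 1 comes back, which is the behaviour the question wants to avoid.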
Update. Django 1.8 has a new conditional expressions feature, so now we can do it like this:
events = Event.objects.all().annotate(paid_participants=models.Sum(
    models.Case(
        models.When(participant__is_paid=True, then=1),
        default=0,
        output_field=models.IntegerField()
    )))
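As a rough illustration of what a sum-case annotation computes, the sketch below runs comparable SQL through Python's built-in sqlite3 module. The schema and sample rows are assumptions that mirror the models above, and the query is an approximation of the idea, not the SQL Django actually emits.

```python
import sqlite3

# Hypothetical tables and rows standing in for Event and Participant.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE event (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE participant (
        id INTEGER PRIMARY KEY,
        event_id INTEGER REFERENCES event(id),
        is_paid INTEGER NOT NULL DEFAULT 0
    );
    INSERT INTO event (id, title) VALUES
        (1, 'paid event'), (2, 'unpaid event'), (3, 'empty event');
    INSERT INTO participant (event_id, is_paid) VALUES
        (1, 1), (1, 1), (1, 0), (2, 0);
""")

# LEFT JOIN keeps every event; the CASE turns each joined row into 1 or 0,
# so events without paid participants sum to 0 instead of disappearing.
sum_case_rows = conn.execute("""
    SELECT e.id,
           SUM(CASE WHEN p.is_paid = 1 THEN 1 ELSE 0 END) AS paid_participants
    FROM event e
    LEFT JOIN participant p ON p.event_id = e.id
    GROUP BY e.id
    ORDER BY e.id
""").fetchall()

print(sum_case_rows)  # [(1, 2), (2, 0), (3, 0)]
```

The event with no participants at all still appears, because the single NULL row produced by the LEFT JOIN falls through to the ELSE branch.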
Update 2. Django 2.0 has a new conditional aggregation feature; see the accepted answer below.
Accepted answer by Oli
Conditional aggregation in Django 2.0 allows you to further reduce the amount of faff this has been in the past. This will also use Postgres' filter logic, which is somewhat faster than a sum-case (I've seen numbers like 20-30% bandied around).
Anyway, in your case, we're looking at something as simple as:
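from django.db.models import Q, Count

# 'participant' is the default reverse lookup name for the models in the
# question; use 'participants' only if the ForeignKey sets that related_name.
events = Event.objects.annotate(
    paid_participants=Count('participant', filter=Q(participant__is_paid=True))
)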
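The filter logic mentioned above can be tried outside Django with Python's built-in sqlite3 module. This is a minimal sketch under assumptions: the schema and rows are hypothetical, the FILTER clause on aggregates needs SQLite 3.30 or newer (hence the fallback), and the SQL is only an approximation of a filtered count, not Django's generated query.

```python
import sqlite3

# Hypothetical tables and rows standing in for Event and Participant.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE event (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE participant (
        id INTEGER PRIMARY KEY,
        event_id INTEGER REFERENCES event(id),
        is_paid INTEGER NOT NULL DEFAULT 0
    );
    INSERT INTO event (id, title) VALUES
        (1, 'paid event'), (2, 'unpaid event'), (3, 'empty event');
    INSERT INTO participant (event_id, is_paid) VALUES
        (1, 1), (1, 1), (1, 0), (2, 0);
""")

# Aggregate FILTER clauses exist in SQLite 3.30+; older builds get an
# equivalent CASE expression (COUNT ignores the NULLs it produces).
if sqlite3.sqlite_version_info >= (3, 30, 0):
    paid_count = "COUNT(p.id) FILTER (WHERE p.is_paid = 1)"
else:
    paid_count = "COUNT(CASE WHEN p.is_paid = 1 THEN p.id END)"

filter_rows = conn.execute(f"""
    SELECT e.id, {paid_count} AS paid_participants
    FROM event e
    LEFT JOIN participant p ON p.event_id = e.id
    GROUP BY e.id
    ORDER BY e.id
""").fetchall()

print(filter_rows)  # [(1, 2), (2, 0), (3, 0)]
```

Both branches give the same counts; the FILTER form simply states the condition next to the aggregate instead of inside a CASE.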
There's a separate section in the docs about filtering on annotations. It's the same stuff as conditional aggregation but more like my example above. Either way, this is a lot healthier than the gnarly subqueries I was doing before.
Answer by Todor
UPDATE
The sub-query approach which I mention is now supported in Django 1.11 via subquery expressions.
from django.db import models
from django.db.models import Count, OuterRef, Subquery

Event.objects.annotate(
    num_paid_participants=Subquery(
        Participant.objects.filter(
            is_paid=True,
            event=OuterRef('pk')
        ).values('event')
        .annotate(cnt=Count('pk'))
        .values('cnt'),
        output_field=models.IntegerField()
    )
)
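One caveat worth checking with this approach: a grouped correlated subquery returns no row for an event without paid participants, so the annotated value is NULL rather than 0. The sketch below shows that effect with Python's built-in sqlite3 module. The schema and rows are assumptions, and the SQL only approximates the subquery shape. In Django, wrapping the Subquery in Coalesce from django.db.models.functions is one likely way to get 0, though that is not part of the original answer.

```python
import sqlite3

# Hypothetical tables and rows standing in for Event and Participant.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE event (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE participant (
        id INTEGER PRIMARY KEY,
        event_id INTEGER REFERENCES event(id),
        is_paid INTEGER NOT NULL DEFAULT 0
    );
    INSERT INTO event (id, title) VALUES
        (1, 'paid event'), (2, 'unpaid event'), (3, 'empty event');
    INSERT INTO participant (event_id, is_paid) VALUES
        (1, 1), (1, 1), (1, 0), (2, 0);
""")

# The grouped subquery yields no row when nothing matches, which SQL
# turns into NULL; COALESCE replaces that NULL with 0.
subquery_rows = conn.execute("""
    SELECT e.id,
           (SELECT COUNT(p.id) FROM participant p
             WHERE p.is_paid = 1 AND p.event_id = e.id
             GROUP BY p.event_id) AS raw_count,
           COALESCE((SELECT COUNT(p.id) FROM participant p
                      WHERE p.is_paid = 1 AND p.event_id = e.id
                      GROUP BY p.event_id), 0) AS safe_count
    FROM event e
    ORDER BY e.id
""").fetchall()

print(subquery_rows)  # [(1, 2, 2), (2, None, 0), (3, None, 0)]
```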
I prefer this over aggregation (sum+case), because it should be faster and easier to optimize (with proper indexing).
For older versions, the same can be achieved using .extra:
Event.objects.extra(select={'num_paid_participants': "\
    SELECT COUNT(*) \
    FROM `myapp_participant` \
    WHERE `myapp_participant`.`is_paid` = 1 AND \
        `myapp_participant`.`event_id` = `myapp_event`.`id`"
})
Answer by rudyryk
Just discovered that Django 1.8 has a new conditional expressions feature, so now we can do it like this:
events = Event.objects.all().annotate(paid_participants=models.Sum(
    models.Case(
        models.When(participant__is_paid=True, then=1),
        default=0, output_field=models.IntegerField()
    )))
Answer by Raffi
I would suggest using the .values method of your Participant queryset instead.
In short, what you want to do is given by:
Participant.objects\
    .filter(is_paid=True)\
    .values('event')\
    .distinct()\
    .annotate(models.Count('id'))
A complete example is as follows:
Create 2 Events:
event1 = Event.objects.create(title='event1')
event2 = Event.objects.create(title='event2')
Add Participants to them:
part1l = [Participant.objects.create(event=event1, is_paid=((_%2) == 0))
          for _ in range(10)]
part2l = [Participant.objects.create(event=event2, is_paid=((_%2) == 0))
          for _ in range(50)]
Group all Participants by their event field:
Participant.objects.values('event')
> <QuerySet [{'event': 1}, {'event': 1}, {'event': 1}, {'event': 1}, {'event': 1}, {'event': 1}, {'event': 1}, {'event': 1}, {'event': 1}, {'event': 1}, {'event': 2}, {'event': 2}, {'event': 2}, {'event': 2}, {'event': 2}, {'event': 2}, {'event': 2}, {'event': 2}, {'event': 2}, {'event': 2}, '...(remaining elements truncated)...']>
Here distinct is needed:
Participant.objects.values('event').distinct()
> <QuerySet [{'event': 1}, {'event': 2}]>
What .values and .distinct are doing here is creating two buckets of Participants grouped by their event field. Note that those buckets contain Participants.
You can then annotate those buckets, as they contain the set of original Participants. Here we want to count the number of Participants, which is simply done by counting the ids of the elements in those buckets (since those are Participants):
Participant.objects\
    .values('event')\
    .distinct()\
    .annotate(models.Count('id'))
> <QuerySet [{'event': 1, 'id__count': 10}, {'event': 2, 'id__count': 50}]>
Finally, since you want only the Participants with is_paid being True, you may just add a filter in front of the previous expression, which yields the expression shown above:
Participant.objects\
    .filter(is_paid=True)\
    .values('event')\
    .distinct()\
    .annotate(models.Count('id'))
> <QuerySet [{'event': 1, 'id__count': 5}, {'event': 2, 'id__count': 25}]>
The only drawback is that you have to retrieve the Event afterwards, as you only have the id from the method above.
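Because the grouped result only contains events that have at least one paid participant, the counts still need to be mapped back onto the full set of events. Here is a minimal plain-Python sketch of that step; the rows are shaped like the QuerySet output above, and the list of event ids (including a hypothetical event 3 with no paid participants) is an assumption for illustration.

```python
# Rows shaped like the .values('event').annotate(Count('id')) output above.
paid_counts = [
    {'event': 1, 'id__count': 5},
    {'event': 2, 'id__count': 25},
]

# Hypothetical ids of all events; event 3 has no paid participants.
all_event_ids = [1, 2, 3]

# Index the counts by event id, then fall back to 0 for missing events.
count_by_event = {row['event']: row['id__count'] for row in paid_counts}
paid_by_event = {
    event_id: count_by_event.get(event_id, 0) for event_id in all_event_ids
}

print(paid_by_event)  # {1: 5, 2: 25, 3: 0}
```

This costs a second query to fetch the events, which is the drawback the answer describes.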
Answer by Arindam Roychowdhury
The result I am looking for:
- People (assignees) who have tasks added to a report, i.e. the total unique count of people.
- People who have tasks added to a report, but only for tasks whose billability is more than 0.
In general, I would have to use two different queries:
Task.objects.filter(billable_efforts__gt=0)
Task.objects.all()
But I want both in one query. Hence:
from django.db.models import Count, Q

Task.objects.values('report__title').annotate(
    withMoreThanZero=Count('assignee', distinct=True, filter=Q(billable_efforts__gt=0))
).annotate(
    totalUniqueAssignee=Count('assignee', distinct=True)
)
Result:
<QuerySet [{'report__title': 'TestReport', 'withMoreThanZero': 37, 'totalUniqueAssignee': 50}, {'report__title': 'Utilization_Report_April_2019', 'withMoreThanZero': 37, 'totalUniqueAssignee': 50}]>
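To illustrate how a filtered distinct count and an unfiltered one can sit in the same grouped query, here is a small sketch with Python's built-in sqlite3 module. The flattened task table, its column names, and the sample rows are all assumptions (the real Task and report models are not shown in the answer), and the CASE-based form is just one portable way to express the condition.

```python
import sqlite3

# Hypothetical flattened task table; the real models are not shown above.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE task (
        id INTEGER PRIMARY KEY,
        report_title TEXT,
        assignee TEXT,
        billable_efforts INTEGER
    );
    INSERT INTO task (report_title, assignee, billable_efforts) VALUES
        ('A', 'ann', 5), ('A', 'ann', 0), ('A', 'bob', 0),
        ('B', 'cat', 3), ('B', 'cat', 2);
""")

# The CASE yields NULL for non-billable tasks, and COUNT(DISTINCT ...)
# ignores NULLs, so both counts come out of a single pass per report.
task_rows = conn.execute("""
    SELECT report_title,
           COUNT(DISTINCT CASE WHEN billable_efforts > 0
                               THEN assignee END) AS with_more_than_zero,
           COUNT(DISTINCT assignee) AS total_unique_assignee
    FROM task
    GROUP BY report_title
    ORDER BY report_title
""").fetchall()

print(task_rows)  # [('A', 1, 2), ('B', 1, 1)]
```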