SQL Django只选择具有重复字段值的行
声明:本页面是StackOverFlow热门问题的中英对照翻译,遵循CC BY-SA 4.0协议,如果您需要使用它,必须同样遵循CC BY-SA许可,注明原文地址和作者信息,同时你必须将它归于原作者(不是我):StackOverFlow
原文地址: http://stackoverflow.com/questions/8989221/
Warning: these are provided under cc-by-sa 4.0 license. You are free to use/share it, But you must attribute it to the original authors (not me):
StackOverFlow
Django select only rows with duplicate field values
提问by dragoon
suppose we have a model in django defined as follows:
假设我们在 django 中有一个模型定义如下:
class Literal:
name = models.CharField(...)
...
Name field is not unique, and thus can have duplicate values. I need to accomplish the following task:
Select all rows from the model that have at least one duplicate valueof the name
field.
Name 字段不是唯一的,因此可以有重复的值。我需要完成以下任务:从模型中选择至少具有一个重复name
字段值的所有行。
I know how to do it using plain SQL (may be not the best solution):
我知道如何使用普通 SQL 来实现(可能不是最好的解决方案):
select * from literal where name IN (
select name from literal group by name having count((name)) > 1
);
So, is it possible to select this using django ORM? Or better SQL solution?
那么,是否可以使用 django ORM 来选择它?还是更好的 SQL 解决方案?
回答by Chris Pratt
Try:
尝试:
from django.db.models import Count
Literal.objects.values('name')
.annotate(Count('id'))
.order_by()
.filter(id__count__gt=1)
This is as close as you can get with Django. The problem is that this will return a ValuesQuerySet
with only name
and count
. However, you can then use this to construct a regular QuerySet
by feeding it back into another query:
这与 Django 最接近。问题是这将返回一个ValuesQuerySet
只有name
和count
。但是,您可以使用QuerySet
它通过将其反馈给另一个查询来构建正则:
dupes = Literal.objects.values('name')
.annotate(Count('id'))
.order_by()
.filter(id__count__gt=1)
Literal.objects.filter(name__in=[item['name'] for item in dupes])
回答by Piper Merriam
This was rejected as an edit. So here it is as a betteranswer
这被拒绝作为编辑。所以这里是一个更好的答案
dups = (
Literal.objects.values('name')
.annotate(count=Count('id'))
.values('name')
.order_by()
.filter(count__gt=1)
)
This will return a ValuesQuerySet
with all of the duplicate names. However, you can then use this to construct a regular QuerySet
by feeding it back into another query. The django ORM is smart enough to combine these into a single query:
这将返回ValuesQuerySet
带有所有重复名称的 a。但是,您可以使用QuerySet
它通过将其反馈回另一个查询来构建正则。django ORM 足够聪明,可以将这些组合成一个查询:
Literal.objects.filter(name__in=dups)
The extra call to .values('name')
after the annotate call looks a little strange. Without this, the subquery fails. The extra values tricks the ORM into only selecting the name column for the subquery.
.values('name')
annotate 调用之后的额外调用看起来有点奇怪。没有这个,子查询就会失败。额外的值诱使 ORM 只为子查询选择名称列。
回答by JamesO
try using aggregation
尝试使用聚合
Literal.objects.values('name').annotate(name_count=Count('name')).exclude(name_count=1)
回答by Eugene Pakhomov
In case you use PostgreSQL, you can do something like this:
如果您使用 PostgreSQL,您可以执行以下操作:
from django.contrib.postgres.aggregates import ArrayAgg
from django.db.models import Func, Value
duplicate_ids = (Literal.objects.values('name')
.annotate(ids=ArrayAgg('id'))
.annotate(c=Func('ids', Value(1), function='array_length'))
.filter(c__gt=1)
.annotate(ids=Func('ids', function='unnest'))
.values_list('ids', flat=True))
It results in this rather simple SQL query:
它导致这个相当简单的 SQL 查询:
SELECT unnest(ARRAY_AGG("app_literal"."id")) AS "ids"
FROM "app_literal"
GROUP BY "app_literal"."name"
HAVING array_length(ARRAY_AGG("app_literal"."id"), 1) > 1
回答by user2959723
If you want to result only names list but not objects, you can use the following query
如果只想生成名称列表而不生成对象,可以使用以下查询
repeated_names = Literal.objects.values('name').annotate(Count('id')).order_by().filter(id__count__gt=1).values_list('name', flat='true')