PostgreSQL: SQL filtering by multiple items in the same column

Disclaimer: this page is a Chinese-English translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow. Original URL: http://stackoverflow.com/questions/1330221/

Time: 2020-09-19 23:49:06  Source: igfitidea

SQL filtering by multiple items in the same column

Tags: sql, postgresql, filter

Asked by Chris

I've got two tables in SQL, one with a project and one with categories that projects belong to, i.e. the JOIN would look roughly like:

Project | Category
--------+---------
  Foo   | Apple
  Foo   | Banana
  Foo   | Carrot
  Bar   | Apple
  Bar   | Carrot
  Qux   | Apple
  Qux   | Banana

(Strings replaced with IDs from a higher normal form, obviously, but you get the point here.)

What I want to do is allow filtering such that users can select any number of categories and results will be filtered to projects that are members of all the selected categories. For example, if a user selects categories "Apple" and "Banana", projects "Foo" and "Qux" show up. If a user selects categories "Apple", "Banana", and "Carrot", then only the "Foo" project shows up.

The first thing I tried was a simple SELECT DISTINCT Project FROM ... WHERE Category = 'Apple' AND Category = 'Banana', but of course that doesn't work since Apple and Banana show up in the same column in two different rows for any common project.
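
To make the failure concrete, here is a minimal sketch that runs the naive query against the sample rows. It uses Python's standard sqlite3 module as a stand-in for PostgreSQL, and the table name project_category is an assumption made for illustration (the question does not name its tables).

```python
import sqlite3

# Hypothetical table: project_category(project, category) stands in for
# the joined result shown in the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE project_category (project TEXT, category TEXT)")
conn.executemany(
    "INSERT INTO project_category VALUES (?, ?)",
    [("Foo", "Apple"), ("Foo", "Banana"), ("Foo", "Carrot"),
     ("Bar", "Apple"), ("Bar", "Carrot"),
     ("Qux", "Apple"), ("Qux", "Banana")],
)

# WHERE is evaluated one row at a time, and each row holds exactly one
# category, so no single row can satisfy both predicates.
and_rows = conn.execute(
    "SELECT DISTINCT project FROM project_category "
    "WHERE category = 'Apple' AND category = 'Banana'"
).fetchall()

# OR does match rows, but it returns projects that have either category.
or_rows = conn.execute(
    "SELECT DISTINCT project FROM project_category "
    "WHERE category = 'Apple' OR category = 'Banana' "
    "ORDER BY project"
).fetchall()

print(and_rows)  # []
print(or_rows)   # [('Bar',), ('Foo',), ('Qux',)] -- Bar has no Banana
```

Neither form expresses "has all of these categories", which is why the condition has to be checked across a project's rows rather than within one row.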

GROUP BY and HAVING don't do me any good, so tell me: is there an obvious way to do this that I'm missing, or is it really so complicated that I'm going to have to resort to recursive joins?

This is in PostgreSQL, by the way, but of course standard SQL code is always preferable when possible.

Answered by Quassnoi

See this article in my blog for performance details:

The solution below:

  • Works on any number of categories

  • Is more efficient than COUNT and GROUP BY, since it checks the existence of each project / category pair exactly once, without counting.

SELECT  *
FROM    (
        SELECT  DISTINCT Project
        FROM    mytable
        ) mo
WHERE   NOT EXISTS
        (
        SELECT  NULL
        FROM    (
                SELECT  'Apple' AS Category
                UNION ALL
                SELECT   'Banana'
                UNION ALL
                SELECT   'Carrot'
                ) list
        WHERE   NOT EXISTS
                (
                SELECT  NULL
                FROM    mytable mii
                WHERE   mii.Project = mo.Project
                        AND mii.Category = list.Category
                )
        )
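
As a rough way to try the double NOT EXISTS pattern on the question's sample data, the sketch below wraps it in a small Python helper using the standard sqlite3 module (SQLite here is only a stand-in for PostgreSQL). The helper name projects_in_all, the alias wanted, and the lowercase column names are assumptions for illustration; the category list is built from bound parameters instead of literals.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (project TEXT, category TEXT)")
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?)",
    [("Foo", "Apple"), ("Foo", "Banana"), ("Foo", "Carrot"),
     ("Bar", "Apple"), ("Bar", "Carrot"),
     ("Qux", "Apple"), ("Qux", "Banana")],
)

def projects_in_all(conn, categories):
    """Return projects for which no wanted category is missing."""
    if not categories:
        raise ValueError("at least one category is required")
    # Inline list: SELECT ? AS category UNION ALL SELECT ? ...
    wanted = " UNION ALL ".join(
        ["SELECT ? AS category"] + ["SELECT ?"] * (len(categories) - 1)
    )
    sql = f"""
        SELECT mo.project
        FROM (SELECT DISTINCT project FROM mytable) mo
        WHERE NOT EXISTS (
            SELECT NULL
            FROM ({wanted}) wanted
            WHERE NOT EXISTS (
                SELECT NULL
                FROM mytable mii
                WHERE mii.project = mo.project
                  AND mii.category = wanted.category
            )
        )
        ORDER BY mo.project
    """
    return [row[0] for row in conn.execute(sql, list(categories))]

two = projects_in_all(conn, ["Apple", "Banana"])
three = projects_in_all(conn, ["Apple", "Banana", "Carrot"])
print(two)    # ['Foo', 'Qux']
print(three)  # ['Foo']
```

Read inside out: the inner NOT EXISTS finds a wanted category the project lacks, and the outer NOT EXISTS keeps only projects for which no such category exists.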

Answered by derobert

Since a project can only be in a category once, we can use COUNT to pull this stunt off:

SELECT project, COUNT(category) AS cat_count
  FROM /* your join */
  WHERE category IN ('Apple', 'Banana')
  GROUP BY project
  -- PostgreSQL does not accept the cat_count alias in HAVING, so repeat the aggregate
  HAVING COUNT(category) = 2

A project with a category of only apple or banana will get a count of 1, and thus fail the HAVING clause. Only a project with both categories will get a count of 2.
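
The counting approach generalizes to any number of selected categories by comparing the count to the size of the selection. The sketch below tries this with Python's standard sqlite3 module as a stand-in for PostgreSQL; the table name mytable and the helper projects_with_all are assumptions for illustration. The aggregate is repeated in HAVING instead of referring to a SELECT alias, since PostgreSQL does not accept output-column aliases there.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (project TEXT, category TEXT)")
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?)",
    [("Foo", "Apple"), ("Foo", "Banana"), ("Foo", "Carrot"),
     ("Bar", "Apple"), ("Bar", "Carrot"),
     ("Qux", "Apple"), ("Qux", "Banana")],
)

def projects_with_all(conn, categories):
    """Projects whose matching-category count equals the selection size."""
    wanted = sorted(set(categories))  # drop repeated selections
    if not wanted:
        raise ValueError("at least one category is required")
    placeholders = ", ".join("?" for _ in wanted)
    sql = f"""
        SELECT project
        FROM mytable
        WHERE category IN ({placeholders})
        GROUP BY project
        HAVING COUNT(category) = ?
        ORDER BY project
    """
    # Bind the categories first, then the required count.
    return [row[0] for row in conn.execute(sql, [*wanted, len(wanted)])]

pair = projects_with_all(conn, ["Apple", "Banana"])
triple = projects_with_all(conn, ["Apple", "Banana", "Carrot"])
print(pair)    # ['Foo', 'Qux']
print(triple)  # ['Foo']
```

This relies on the stated premise that a project appears in a given category at most once.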

If for some reason you have duplicate categories, you can use something like COUNT(DISTINCT category). COUNT(*) should also work in place of COUNT(category); the two differ only if category can be NULL.
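
A small sketch of why duplicates matter, again using Python's standard sqlite3 module as a stand-in for PostgreSQL, with an assumed table name mytable and a deliberately duplicated row that is not part of the question's data:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (project TEXT, category TEXT)")
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?)",
    [("Foo", "Apple"), ("Foo", "Banana"),
     ("Bar", "Apple"), ("Bar", "Apple")],  # duplicate row; Bar has no Banana
)

def matching(count_expr):
    """Run the counting query with the given aggregate expression."""
    sql = f"""
        SELECT project
        FROM mytable
        WHERE category IN ('Apple', 'Banana')
        GROUP BY project
        HAVING {count_expr} = 2
        ORDER BY project
    """
    return [row[0] for row in conn.execute(sql)]

plain = matching("COUNT(category)")
distinct = matching("COUNT(DISTINCT category)")
print(plain)     # ['Bar', 'Foo'] -- Bar is a false positive
print(distinct)  # ['Foo']
```

With the duplicate present, the plain count reaches 2 for Bar from two Apple rows, while COUNT(DISTINCT category) counts Apple once.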

Answered by Chris

One other solution is, of course, something like "SELECT DISTINCT Project FROM ... AS a WHERE 'Apple' IN (SELECT Category FROM ... AS b WHERE a.Project = b.Project) AND 'Banana' IN (SELECT Category FROM ... AS b WHERE a.Project = b.Project)", but that gets pretty computationally expensive pretty quickly. I was hoping for something more elegant, and you guys haven't disappointed. I'm including this one mostly for completeness in case someone else consults this question. It's clearly worth zero points. :)
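
For completeness, a sketch of that query generated for an arbitrary selection, one correlated IN subquery per category. It uses Python's standard sqlite3 module as a stand-in for PostgreSQL; the table name mytable and the helper projects_via_in are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (project TEXT, category TEXT)")
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?)",
    [("Foo", "Apple"), ("Foo", "Banana"), ("Foo", "Carrot"),
     ("Bar", "Apple"), ("Bar", "Carrot"),
     ("Qux", "Apple"), ("Qux", "Banana")],
)

def projects_via_in(conn, categories):
    """Build one correlated IN subquery per selected category."""
    if not categories:
        raise ValueError("at least one category is required")
    clauses = " AND ".join(
        "? IN (SELECT b.category FROM mytable b WHERE b.project = a.project)"
        for _ in categories
    )
    sql = (
        "SELECT DISTINCT a.project FROM mytable a "
        f"WHERE {clauses} ORDER BY a.project"
    )
    return [row[0] for row in conn.execute(sql, list(categories))]

both = projects_via_in(conn, ["Apple", "Banana"])
print(both)  # ['Foo', 'Qux']
```

The number of subqueries grows with the number of selected categories, which is the cost concern raised above.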
