Get the most common value for each value of another column in SQL

Question

Asked by Martin C. Martin

I have a table like this:

 Column  | Type | Modifiers 
---------+------+-----------
 country | text | 
 food_id | int  | 
 eaten   | date |

And for each country, I want to get the food that is eaten most often. The best I can think of (I'm using postgres) is:

CREATE TEMP TABLE counts AS 
   SELECT country, food_id, count(*) as count FROM munch GROUP BY country, food_id;

CREATE TEMP TABLE max_counts AS 
   SELECT country, max(count) as max_count FROM counts GROUP BY country;

SELECT country, max(food_id) FROM counts 
   WHERE (country, count) IN (SELECT * from max_counts) GROUP BY country;

In that last statement, the GROUP BY and max() are needed to break ties, where two different foods have the same count.

This seems like a lot of work for something conceptually simple. Is there a more straightforward way to do it?

Answer 1

Answered by pilcrow

PostgreSQL introduced support for window functions in 8.4, the year after this question was asked. It's worth noting that it might be solved today as follows:

SELECT country, food_id
  FROM (SELECT country, food_id, ROW_NUMBER() OVER (PARTITION BY country ORDER BY freq DESC) AS rn
          FROM (  SELECT country, food_id, COUNT('x') AS freq
                    FROM country_foods
                GROUP BY 1, 2) food_freq) ranked_food_req
 WHERE rn = 1;

The above will break ties arbitrarily, returning exactly one food per country. If you don't want to break ties and would rather get every tied food, you could use DENSE_RANK() instead.
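
The DENSE_RANK() variant mentioned above might look like the following. This is only a sketch: it assumes the question's munch table (country, food_id, eaten) rather than country_foods, and the aliases freq, rnk and ranked are made up for illustration.

```sql
-- Sketch: keep every food tied for the top count in its country.
-- Assumes the question's table munch(country, food_id, eaten).
SELECT country, food_id, freq
  FROM (SELECT country,
               food_id,
               COUNT(*) AS freq,
               -- all foods sharing the highest count get rank 1
               DENSE_RANK() OVER (PARTITION BY country
                                  ORDER BY COUNT(*) DESC) AS rnk
          FROM munch
         GROUP BY country, food_id) ranked
 WHERE rnk = 1;
```

A country with two equally popular foods then appears twice in the result, once per food.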

Answer 2

Answered by jrouquie

It is now even simpler: PostgreSQL 9.4 introduced the mode() function:

select country, mode() within group (order by food_id)
from munch
group by country

returns (like user2247323's example):

 country | mode
---------+------
 GB      | 3
 US      | 1

See documentation here: https://wiki.postgresql.org/wiki/Aggregate_Mode

https://www.postgresql.org/docs/current/static/functions-aggregate.html#FUNCTIONS-ORDEREDSET-TABLE
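
On ties: the linked documentation says mode() picks the first of several equally frequent values, so the sort direction inside WITHIN GROUP should decide which tied food wins. A sketch under that reading, which reproduces the max(food_id) tie-break from the question (the column alias is made up):

```sql
-- Sketch: one row per country with its most frequent food_id.
-- Assumption: because mode() keeps the first of several equally frequent
-- values, sorting food_id DESC resolves ties toward the highest food_id.
SELECT country,
       mode() WITHIN GROUP (ORDER BY food_id DESC) AS most_common_food_id
  FROM munch
 GROUP BY country;
```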

Answer 3

Answered by jkramer

SELECT DISTINCT
"F1"."food",
"F1"."country"
FROM "foo" "F1"
WHERE
"F1"."food" =
    (SELECT "food" FROM
        (
            SELECT "food", COUNT(*) AS "count"
            FROM "foo" "F2" 
            WHERE "F2"."country" = "F1"."country" 
            GROUP BY "F2"."food" 
            ORDER BY "count" DESC
        ) AS "F5"
        LIMIT 1
    )

Well, I wrote this in a hurry and didn't check it really well. The sub-select might be pretty slow, but this is the shortest and simplest SQL statement that I could think of. I'll probably tell more when I'm less drunk.

PS: Oh well, "foo" is the name of my table, "food" contains the name of the food and "country" the name of the country. Sample output:

   food    |  country   
-----------+------------
 Bratwurst | Germany
 Fisch     | Frankreich

Answer 4

Answered by Jamal Hansen

Try this:

Select Country, Food_id
From Munch T1
Where Food_id= 
    (Select Food_id
     from Munch T2
     where T1.Country= T2.Country
     group by Food_id
     order by count(Food_id) desc
      limit 1)
group by Country, Food_id

Answer 5

Answered by JCF

Here is a statement which I believe gives you what you want and is simple and concise:

select distinct on (country) country, food_id
from munch
group by country, food_id
order by country, count(*) desc

Please let me know what you think.

BTW, the distinct on feature is only available in Postgres.

Example, source data:

 country | food_id | eaten
---------+---------+----------
 US      | 1       | 2017-1-1
 US      | 1       | 2017-1-1
 US      | 2       | 2017-1-1
 US      | 3       | 2017-1-1
 GB      | 3       | 2017-1-1
 GB      | 3       | 2017-1-1
 GB      | 2       | 2017-1-1

Output:

 country | food_id
---------+---------
 US      | 1
 GB      | 3
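
DISTINCT ON keeps the first row of each country in ORDER BY order, so when two foods have the same count the statement above may return either one. A sketch that adds an explicit tie-break; choosing the highest food_id is an assumption borrowed from the question:

```sql
-- Sketch: same query with a deterministic tie-break.
-- Rows are sorted per country by count (highest first), then by food_id
-- (highest first); DISTINCT ON keeps only the first row of each country.
SELECT DISTINCT ON (country) country, food_id
  FROM munch
 GROUP BY country, food_id
 ORDER BY country, count(*) DESC, food_id DESC;
```

For the sample data above there are no ties for first place, so the result is the same two rows.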

Answer 6

Answered by Matt Rogish

SELECT country, MAX( food_id )
  FROM( SELECT m1.country, m1.food_id
          FROM munch m1
         INNER JOIN ( SELECT country
                           , food_id
                           , COUNT(*) as food_counts
                        FROM munch m2
                    GROUP BY country, food_id ) as m3
                 ON m1.country = m3.country
         GROUP BY m1.country, m1.food_id 
        HAVING COUNT(*) / COUNT(DISTINCT m3.food_id) = MAX(food_counts) ) AS max_foods
  GROUP BY country

I don't like the MAX(.) GROUP BY to break ties... There's gotta be a way to incorporate eaten date into the JOIN in some way to arbitrarily select the most recent one...

I'm interested in the query plan for this thing if you run it on your live data!
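
One possible way to bring the eaten date into the tie-break, as wished for above, is to rank the per-food counts with a window function and use the most recent eaten date as a second sort key. A sketch only, assuming PostgreSQL 8.4 or later; the aliases rn and ranked are made up:

```sql
-- Sketch: most frequent food per country; among foods with the same count,
-- prefer the one that was eaten most recently.
SELECT country, food_id
  FROM (SELECT country,
               food_id,
               ROW_NUMBER() OVER (PARTITION BY country
                                  ORDER BY COUNT(*) DESC,
                                           MAX(eaten) DESC) AS rn
          FROM munch
         GROUP BY country, food_id) ranked
 WHERE rn = 1;
```

Foods that tie on both count and latest date are still picked arbitrarily.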

Answer 7

Answered by Theo

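-- PostgreSQL does not accept nested aggregates such as max(count(*)),
-- so the per-food counts are computed in a derived table first.
select country, food_id, count(*) as cnt
from   food f1
group by country, food_id
having count(*) = (select max(cnt)
                   from  (select count(*) as cnt
                          from   food f2
                          where  f2.country = f1.country
                          group by f2.food_id) per_food)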

Answer 8

Answered by John MacIntyre

Try something like this:

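-- Count how often each food is eaten per country
-- (select ... into #tempTbl is SQL Server temp-table syntax).
select country, food_id, count(*) cnt 
into #tempTbl 
from mytable 
group by country, food_id

-- Keep the foods whose count equals the highest count in their country.
select country, food_id
from  #tempTbl as x
where cnt = 
  (select max(cnt) 
  from #tempTbl 
  where country=x.country)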

This could be put all into a single select, but I don't have time to muck around with it right now.
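
For reference, the single-statement form could be sketched with a common table expression. This assumes a database that supports WITH (PostgreSQL 8.4 or later, for example); the names counts, c and c2 are made up:

```sql
-- Sketch: per-country counts in a CTE, then keep the foods whose count
-- equals the highest count in their country (tied foods are all returned).
WITH counts AS (
    SELECT country, food_id, COUNT(*) AS cnt
      FROM mytable
     GROUP BY country, food_id
)
SELECT c.country, c.food_id
  FROM counts c
 WHERE c.cnt = (SELECT MAX(c2.cnt)
                  FROM counts c2
                 WHERE c2.country = c.country);
```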

Good luck.

Answer 9

Answered by JosephStyons

Here's how to do it without any temp tables:

Edit: simplified

select nf.country, nf.food_id as most_frequent_food_id
from national_foods nf
group by country, food_id 
having
  (country,count(*)) in (  
                        select country, max(cnt)
                        from
                          (
                          select country, food_id, count(*) as cnt
                          from national_foods nf1
                          group by country, food_id
                          ) counts  -- PostgreSQL before version 16 requires an alias here
                        group by country
                        having country = nf.country
                        )
