Notice: this page is a Chinese-English parallel translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). StackOverflow original: http://stackoverflow.com/questions/659845/

Time: 2020-08-31 12:57:02  Source: igfitidea

MySQL: Total GROUP BY WITH ROLLUP curiosity

mysql

Asked by Svish

I have two queries. One of them makes sense to me, the other doesn't. First one:

SELECT gender AS 'Gender', count(*) AS '#'
    FROM registrations 
    GROUP BY gender WITH ROLLUP

That gives me this:

Gender       #
Female      20
Male        19
NULL        39

So, I get the count, and the total count. What I expected. Next one:

SELECT c.printable_name AS 'Country', count(*) AS '#' 
    FROM registrations r 
    INNER JOIN country c ON r.country = c.country_id 
    GROUP BY country WITH ROLLUP

Country         #
Denmark         9
Norway         10
Sweden         18
United States   1
Uzbekistan      1
Uzbekistan     39

Same result. But why do I get Uzbekistan for the total??

Answered by bobince

But why do I get Uzbekistan for the total??

Because you're not SELECTing the item that you're GROUPing BY. If you said:

GROUP BY c.printable_name

You'd get the expected NULL. However, you're grouping by a different column, so MySQL doesn't know that printable_name is taking part in a rollup group, and it selects any old value from that column in the join of all registrations. (So it is possible you will see countries other than Uzbekistan.)

This is part of a wider problem with MySQL being permissive on what you can SELECT in a GROUP BY query. For example, you can say:

SELECT gender FROM registrations GROUP BY country;

and MySQL will happily pick one of the gender values for a registration from each country, even though there is no direct causal link (aka “functional dependency”) between country and gender. Other DBMSs will refuse the above command on the grounds that there isn't guaranteed to be one gender per country.(*)
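
To make the "functional dependency" point concrete, here is a minimal sketch that checks whether gender really depends on country. It runs against an in-memory SQLite database rather than MySQL so that it is self-contained; the table layout and the four sample rows are assumptions for illustration, not the asker's real data.

```python
import sqlite3

# Illustrative rows only; the schema is a guess at the asker's table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE registrations (country INTEGER, gender TEXT)")
conn.executemany(
    "INSERT INTO registrations VALUES (?, ?)",
    [(1, "Female"), (1, "Male"), (2, "Male"), (2, "Male")],
)

# Countries that map to more than one gender: any row returned here
# means gender is NOT functionally dependent on country.
violations = conn.execute(
    """
    SELECT country, COUNT(DISTINCT gender) AS genders
    FROM registrations
    GROUP BY country
    HAVING COUNT(DISTINCT gender) > 1
    ORDER BY country
    """
).fetchall()
print(violations)
```

If this query returns any rows, a single "gender per country" value is not well defined, which is why a stricter DBMS rejects the SELECT above.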

Now, this:

SELECT c.printable_name AS 'Country', count(*) AS '#' 
FROM registrations r 
INNER JOIN country c ON r.country = c.country_id 
GROUP BY country

is OK, because there's a functional dependency between r.country and c.printable_name (assuming you have correctly described your country_id as a PRIMARY KEY).

However, MySQL's WITH ROLLUP extension is a bit of a hack in the way it works. At the rollup row stage at the end, it runs over the entire pre-grouping result set to grab its values, and then sets the group-by column to NULL. It doesn't also null other columns that have a functional dependency on that column. It probably should, but MySQL currently doesn't really understand the whole thing about functional dependencies.

So if you select c.printable_name it will show you whichever country name value it randomly picked, and if you select c.country_id it will show you whichever country ID it randomly picked, even though c.country_id is the join criterion and so must be the same as r.country, which is NULL!
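
The behaviour described above can be mimicked with a toy model in plain Python. This is only a sketch of the explanation given here, not MySQL's actual implementation; the function name group_with_rollup, the sample rows, and the choice of "last value read" as the arbitrary pick are all assumptions.

```python
# Joined rows as (r.country, c.printable_name); made-up sample data.
rows = [(1, "Denmark"), (1, "Denmark"), (2, "Norway"), (3, "Uzbekistan")]


def group_with_rollup(rows):
    """Toy model: group on the first column, then append a rollup row."""
    counts = {}  # group key -> row count
    names = {}   # group key -> some printable_name seen for that key
    for key, name in rows:
        counts[key] = counts.get(key, 0) + 1
        names[key] = name
    result = [(key, names[key], counts[key]) for key in sorted(counts)]
    # Rollup row: only the grouped column becomes None; the non-grouped
    # column keeps whatever value happened to be read last.
    leftover_name = rows[-1][1]
    result.append((None, leftover_name, len(rows)))
    return result


for row in group_with_rollup(rows):
    print(row)
```

In this model the final row comes out with None for the grouped column but a leftover country name next to the grand total, which matches the symptom in the question.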

What you can do to work around the problem is:

  • group by printable_name instead; should be OK if printable_names are unique, or
  • select “r.country” as well as printable_name, and check that for being NULL, or
  • forget WITH ROLLUP and do a separate query for the end sum. This will be a little slower but it will also be ANSI SQL-92 compliant so your app could work on other databases.
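
As a rough sketch of the first and third workarounds together (group by printable_name, and fetch the total with a separate query), the following runs against an in-memory SQLite database so it does not depend on WITH ROLLUP at all. The table definitions and the three sample registrations are assumptions made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE country (country_id INTEGER PRIMARY KEY, printable_name TEXT);
    CREATE TABLE registrations (country INTEGER, gender TEXT);
    INSERT INTO country VALUES (1, 'Denmark'), (2, 'Norway');
    INSERT INTO registrations VALUES (1, 'Female'), (1, 'Male'), (2, 'Male');
    """
)

# Same join for both queries, so the total matches the per-country counts.
join = "FROM registrations r INNER JOIN country c ON r.country = c.country_id"

# Workaround 1: group by the column that is actually selected.
per_country = conn.execute(
    f"SELECT c.printable_name, COUNT(*) {join} "
    "GROUP BY c.printable_name ORDER BY c.printable_name"
).fetchall()

# Workaround 3: a separate query for the grand total instead of WITH ROLLUP.
total = conn.execute(f"SELECT COUNT(*) {join}").fetchone()[0]

report = per_country + [("Total", total)]
for name, n in report:
    print(name, n)
```

The total row is appended on the client side, so it is always labelled explicitly and never inherits a value from another column.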

(*: MySQL has an SQL_MODE option, ONLY_FULL_GROUP_BY, that is supposed to address this issue, but it goes much too far and only lets you select columns from the GROUP BY, not columns that have a functional dependency on the GROUP BY. So it will make valid queries fail as well, making it generally useless.)

Answered by Bajlo

Because when you use the JOIN method, the NULL element of the array that follows will take the value of the previous non-NULL element. But I'm not sure. That's my experience when I use it in PHP.

Hm... there is another problem: 'Country' cannot be used because it is the name of a table, so change it to something else. Then the last result will display NULL. Here is my proposal:

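// Assumes $mysqli is an open mysqli connection; the original used the
// mysql_* functions, which were removed in PHP 7.
$result = $mysqli->query("SELECT c.printable_name AS 'countryp', count(*) AS '#'
FROM registrations r, country c WHERE r.country = c.country_id
GROUP BY countryp WITH ROLLUP");

// The rollup row has NULL in the grouped column, so label it 'Total'.
while ($row = $result->fetch_assoc()) {
  $r1 = $row["countryp"];
  $r2 = $row["#"];
  if ($r1 === null) $r1 = 'Total';
  echo "$r1 $r2<br />";
}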

Answered by hdk

SELECT ifnull(c.printable_name, "Total Registration = ") AS 'Country', count(*) AS '#' 
FROM registrations r 
INNER JOIN country c ON r.country = c.country_id 
GROUP BY country WITH ROLLUP;

This would print 'Total Registration = 39', and would be the last row/record.
