MySQL 带有聚合和内部连接的 GROUP BY

声明:本页面是StackOverFlow热门问题的中英对照翻译,遵循CC BY-SA 4.0协议,如果您需要使用它,必须同样遵循CC BY-SA许可,注明原文地址和作者信息,同时你必须将它归于原作者(不是我):StackOverFlow 原文地址: http://stackoverflow.com/questions/6611483/
Warning: these are provided under cc-by-sa 4.0 license. You are free to use/share it, But you must attribute it to the original authors (not me): StackOverFlow

提示:将鼠标放在中文语句上可以显示对应的英文。显示中英文
时间:2020-08-31 20:24:51  来源:igfitidea点击:

GROUP BY with aggregate and an INNER JOIN

mysqlsql

提问by AudioDroid

I tried to narrow down the problem as much as possible, it is still quite something. This is the query that doesn't work the way I want it:

我试图尽可能缩小问题的范围,它仍然很重要。这是不按我想要的方式工作的查询:

SELECT *, MAX(tbl_stopover.dist)
FROM tbl_stopover
INNER JOIN
  (SELECT edges1.id id1, edges2.id id2, COUNT(edges1.id) numConn
  FROM tbl_edges edges1
  INNER JOIN tbl_edges edges2
  ON edges1.nodeB = edges2.nodeA
  GROUP BY edges1.id HAVING numConn = 1) AS tbl_conn
ON tbl_stopover.id_edge = tbl_conn.id1
GROUP BY id_edge

Here is what I get:

这是我得到的:

|id | edge | dist | id1 | id2 | numConn | MAX(tbl_stopover.dist) |
------------------------------------------------------------------
|2  | 23   | 2    | 23  | 35  | 1       | 9                      |
|4  | 24   | 5    | 24  | 46  | 1       | 9                      |
------------------------------------------------------------------

and this is what I would want:

这就是我想要的:

|id | edge | dist | id1 | id2 | numConn | MAX(tbl_stopover.dist) |
------------------------------------------------------------------
|3  | 23   | 9    | 23  | 35  | 1       | 9                      |
|5  | 24   | 9    | 24  | 46  | 1       | 9                      |
------------------------------------------------------------------

But let me elaborate a bit...

但是让我详细说明一下......

I have a graph, let's say as such:

我有一个图表,让我们这样说:

    node1
      |
    node2
   /     \
node3    node4
  |       |
node5    node6

Therefore I have a table I call tbl_edges like this:

因此,我有一张我称之为 tbl_edges 的表,如下所示:

| id  | nodeA | node B |
------------------------
| 12  |   1   |    2   |
| 23  |   2   |    3   |
| 24  |   2   |    4   |
| 35  |   3   |    5   |
| 46  |   4   |    6   |
------------------------

Now each edgehas "stop_overs" at a certain distance (to nodeA). Therefore I have a table tbl_stopover like this:

现在每个edge都有“ stop_overs”在一定距离(到nodeA)。因此我有一个像这样的表 tbl_stopover:

| id  | edge  |  dist  |
------------------------
|  1  |  12   |    5   |
|  2  |  23   |    2   |
|  3  |  23   |    9   |
|  4  |  24   |    5   |
|  5  |  24   |    9   |
|  6  |  35   |    5   |
|  7  |  46   |    5   |
------------------------

Why this query?
Let's assume I want to calculate the distance between the stop_overs. Withinone edge that is no problem. Acrossedges it gets more difficult. But if I have two edges that are connected and there is no other connection I can also calculate the distance. Here an example assuming all edges have a lengthof 10. :

edge23has a stop_over(id=3) at dist=9, edge35has a stop_over(id=6) at dist=5. Therefore the distance between these two stop_overs is:

为什么这个查询?
假设我想计算stop_overs之间的距离。一个边缘是没有问题的。越过边缘变得更加困难。但是如果我有两条连接的边并且没有其他连接,我也可以计算距离。此处作为一例假设所有边缘具有length的10:

边缘23具有stop_over(ID =3在DIST = 9),边缘35具有stop_over(ID =6在DIST = 5)。因此这两个stop_overs之间的距离是:

dist = (length - dist_id3) + dist_id5 = (10-9) + 5

I am not sure if I made my self clear. If this is not understandable, feel free to ask question and I will do my best to make this more understandable.

我不确定我有没有说清楚。如果这不能理解,请随时提出问题,我会尽力使这更易于理解。

回答by JNK

MySQL allows you to do something silly - display fields in an aggregate query that are not a part of the GROUP BYor an aggregate function like MAX. When you do this, you get random (as you said) results for the remaining fields.

MySQL 允许你做一些愚蠢的事情——在聚合查询中显示不属于GROUP BY或聚合函数的字段,如MAX. 当你这样做时,你会得到剩余字段的随机结果(如你所说)。

In your query you are doing this twice- once in your inner query (id2is not part of a GROUP BYor aggregate) and once in the outer.

在您的查询中,您将执行两次此操作- 一次在内部查询中(id2不是 aGROUP BY或聚合的一部分),一次在外部查询中。

Prepare for random results!

为随机结果做好准备!

To fix it, try something like this:

要修复它,请尝试以下操作:

SELECT tbl_stopover.id,
       tbl_stopover.dist,
       tbl_conn.id1,
       tbl_conn.id2,
       tbl_conn.numConn,
       MAX(tbl_stopover.dist)
FROM tbl_stopover
INNER JOIN
  (SELECT edges1.id id1, edges2.id id2, COUNT(edges1.id) numConn
  FROM tbl_edges edges1
  INNER JOIN tbl_edges edges2
  ON edges1.nodeB = edges2.nodeA
  GROUP BY edges1.id, edges2.id
  HAVING numConn = 1) AS tbl_conn
ON tbl_stopover.id_edge = tbl_conn.id1
GROUP BY tbl_stopover.id,
         tbl_stopover.dist,
         tbl_conn.id1,
         tbl_conn.id2,
         tbl_conn.numConn

The major changes are the explicit field list (note that I removed the id_edgesince you are joining on id1and already have that field), and addition of additional fields to both the inner and outer GROUP BYclauses.

主要的变化是显式字段列表(请注意,我删除了id_edge因为您加入id1并且已经拥有该字段),以及向内部和外部GROUP BY子句添加了额外的字段。

If this gives you more rows than you want then you may need to explain more about your desired result set. Something like this is the only way to ensure you get appropriate groupings.

如果这为您提供了比您想要的更多的行,那么您可能需要更多地解释您想要的结果集。这样的事情是确保您获得适当分组的唯一方法。

回答by AudioDroid

Okay. This seemsto be the answer to my question. I will do some further "investigation" though, because I'm not sure if this is reliable. If anybody has some though on this, please leave a comment.

好的。这似乎是我问题的答案。不过,我会做一些进一步的“调查”,因为我不确定这是否可靠。如果有人对此有一些看法,请发表评论。

SELECT tbl.id, tbl.dist, tbl.id1, tbl.id2, MAX(dist) maxDist
FROM
(
  SELECT tbl_stopover.id,
         tbl_stopover.dist,
         tbl_conn.id1,
         tbl_conn.id2,
         tbl_conn.numConn
  FROM tbl_stopover
  INNER JOIN
    (SELECT edges1.id id1, edges2.id id2, COUNT(edges1.id) numConn
    FROM tbl_edges edges1
    INNER JOIN tbl_edges edges2
    ON edges1.nodeB = edges2.nodeA
    GROUP BY edges1.id
    HAVING numConn = 1) AS tbl_conn
  ON tbl_stopover.id_edge = tbl_conn.id1
  GROUP BY tbl_stopover.dist, tbl_conn.id1
  ORDER BY dist DESC) AS tbl
GROUP BY tbl.id1, tbl.id2

Thanks to JNK(my colleague at work) without whom I wouldn't have gotten this far.

多亏了JNK(我的同事)没有他,我就不会走到这一步。