MySQL 带有聚合和内部连接的 GROUP BY
声明:本页面是StackOverFlow热门问题的中英对照翻译,遵循CC BY-SA 4.0协议,如果您需要使用它,必须同样遵循CC BY-SA许可,注明原文地址和作者信息,同时你必须将它归于原作者(不是我):StackOverFlow
原文地址: http://stackoverflow.com/questions/6611483/
Warning: these are provided under cc-by-sa 4.0 license. You are free to use/share it, But you must attribute it to the original authors (not me):
StackOverFlow
GROUP BY with aggregate and an INNER JOIN
提问by AudioDroid
I tried to narrow down the problem as much as possible, it is still quite something. This is the query that doesn't work the way I want it:
我试图尽可能缩小问题的范围,它仍然很重要。这是不按我想要的方式工作的查询:
SELECT *, MAX(tbl_stopover.dist)
FROM tbl_stopover
INNER JOIN
(SELECT edges1.id id1, edges2.id id2, COUNT(edges1.id) numConn
FROM tbl_edges edges1
INNER JOIN tbl_edges edges2
ON edges1.nodeB = edges2.nodeA
GROUP BY edges1.id HAVING numConn = 1) AS tbl_conn
ON tbl_stopover.id_edge = tbl_conn.id1
GROUP BY id_edge
Here is what I get:
这是我得到的:
|id | edge | dist | id1 | id2 | numConn | MAX(tbl_stopover.dist) |
------------------------------------------------------------------
|2 | 23 | 2 | 23 | 35 | 1 | 9 |
|4 | 24 | 5 | 24 | 46 | 1 | 9 |
------------------------------------------------------------------
and this is what I would want:
这就是我想要的:
|id | edge | dist | id1 | id2 | numConn | MAX(tbl_stopover.dist) |
------------------------------------------------------------------
|3 | 23 | 9 | 23 | 35 | 1 | 9 |
|5 | 24 | 9 | 24 | 46 | 1 | 9 |
------------------------------------------------------------------
But let me elaborate a bit...
但是让我详细说明一下......
I have a graph, let's say as such:
我有一个图表,让我们这样说:
node1
|
node2
/ \
node3 node4
| |
node5 node6
Therefore I have a table I call tbl_edges like this:
因此,我有一张我称之为 tbl_edges 的表,如下所示:
| id | nodeA | node B |
------------------------
| 12 | 1 | 2 |
| 23 | 2 | 3 |
| 24 | 2 | 4 |
| 35 | 3 | 5 |
| 46 | 4 | 6 |
------------------------
Now each edge
has "stop_over
s" at a certain distance (to nodeA
). Therefore I have a table tbl_stopover like this:
现在每个edge
都有“ stop_over
s”在一定距离(到nodeA
)。因此我有一个像这样的表 tbl_stopover:
| id | edge | dist |
------------------------
| 1 | 12 | 5 |
| 2 | 23 | 2 |
| 3 | 23 | 9 |
| 4 | 24 | 5 |
| 5 | 24 | 9 |
| 6 | 35 | 5 |
| 7 | 46 | 5 |
------------------------
Why this query?
Let's assume I want to calculate the distance between the stop_over
s. Withinone edge that is no problem. Acrossedges it gets more difficult. But if I have two edges that are connected and there is no other connection I can also calculate the distance. Here an example assuming all edges have a length
of 10. :
edge23
has a stop_over(id=3
) at dist=9, edge35
has a stop_over(id=6
) at dist=5. Therefore the distance between these two stop_over
s is:
为什么这个查询?
假设我想计算stop_over
s之间的距离。在一个边缘内是没有问题的。越过边缘变得更加困难。但是如果我有两条连接的边并且没有其他连接,我也可以计算距离。此处作为一例假设所有边缘具有length
的10:
边缘23
具有stop_over(ID =3
在DIST = 9),边缘35
具有stop_over(ID =6
在DIST = 5)。因此这两个stop_over
s之间的距离是:
dist = (length - dist_id3) + dist_id5 = (10-9) + 5
I am not sure if I made my self clear. If this is not understandable, feel free to ask question and I will do my best to make this more understandable.
我不确定我有没有说清楚。如果这不能理解,请随时提出问题,我会尽力使这更易于理解。
回答by JNK
MySQL allows you to do something silly - display fields in an aggregate query that are not a part of the GROUP BY
or an aggregate function like MAX
. When you do this, you get random (as you said) results for the remaining fields.
MySQL 允许你做一些愚蠢的事情——在聚合查询中显示不属于GROUP BY
或聚合函数的字段,如MAX
. 当你这样做时,你会得到剩余字段的随机结果(如你所说)。
In your query you are doing this twice- once in your inner query (id2
is not part of a GROUP BY
or aggregate) and once in the outer.
在您的查询中,您将执行两次此操作- 一次在内部查询中(id2
不是 aGROUP BY
或聚合的一部分),一次在外部查询中。
Prepare for random results!
为随机结果做好准备!
To fix it, try something like this:
要修复它,请尝试以下操作:
SELECT tbl_stopover.id,
tbl_stopover.dist,
tbl_conn.id1,
tbl_conn.id2,
tbl_conn.numConn,
MAX(tbl_stopover.dist)
FROM tbl_stopover
INNER JOIN
(SELECT edges1.id id1, edges2.id id2, COUNT(edges1.id) numConn
FROM tbl_edges edges1
INNER JOIN tbl_edges edges2
ON edges1.nodeB = edges2.nodeA
GROUP BY edges1.id, edges2.id
HAVING numConn = 1) AS tbl_conn
ON tbl_stopover.id_edge = tbl_conn.id1
GROUP BY tbl_stopover.id,
tbl_stopover.dist,
tbl_conn.id1,
tbl_conn.id2,
tbl_conn.numConn
The major changes are the explicit field list (note that I removed the id_edge
since you are joining on id1
and already have that field), and addition of additional fields to both the inner and outer GROUP BY
clauses.
主要的变化是显式字段列表(请注意,我删除了id_edge
因为您加入id1
并且已经拥有该字段),以及向内部和外部GROUP BY
子句添加了额外的字段。
If this gives you more rows than you want then you may need to explain more about your desired result set. Something like this is the only way to ensure you get appropriate groupings.
如果这为您提供了比您想要的更多的行,那么您可能需要更多地解释您想要的结果集。这样的事情是确保您获得适当分组的唯一方法。
回答by AudioDroid
Okay. This seemsto be the answer to my question. I will do some further "investigation" though, because I'm not sure if this is reliable. If anybody has some though on this, please leave a comment.
好的。这似乎是我问题的答案。不过,我会做一些进一步的“调查”,因为我不确定这是否可靠。如果有人对此有一些看法,请发表评论。
SELECT tbl.id, tbl.dist, tbl.id1, tbl.id2, MAX(dist) maxDist
FROM
(
SELECT tbl_stopover.id,
tbl_stopover.dist,
tbl_conn.id1,
tbl_conn.id2,
tbl_conn.numConn
FROM tbl_stopover
INNER JOIN
(SELECT edges1.id id1, edges2.id id2, COUNT(edges1.id) numConn
FROM tbl_edges edges1
INNER JOIN tbl_edges edges2
ON edges1.nodeB = edges2.nodeA
GROUP BY edges1.id
HAVING numConn = 1) AS tbl_conn
ON tbl_stopover.id_edge = tbl_conn.id1
GROUP BY tbl_stopover.dist, tbl_conn.id1
ORDER BY dist DESC) AS tbl
GROUP BY tbl.id1, tbl.id2
Thanks to JNK
(my colleague at work) without whom I wouldn't have gotten this far.
多亏了JNK
(我的同事)没有他,我就不会走到这一步。