Retrieving the last record in each group - MySQL
Notice: this page reproduces a popular StackOverflow question and its answers, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverFlow
Original URL: http://stackoverflow.com/questions/1313120/
Asked by Vijay Dev
There is a table messages that contains data as shown below:
Id Name Other_Columns
-------------------------
1 A A_data_1
2 A A_data_2
3 A A_data_3
4 B B_data_1
5 B B_data_2
6 C C_data_1
If I run the query select * from messages group by name, I get the following result:
1 A A_data_1
4 B B_data_1
6 C C_data_1
What query will return the following result?
3 A A_data_3
5 B B_data_2
6 C C_data_1
That is, the last record in each group should be returned.
At present, this is the query that I use:
SELECT
*
FROM (SELECT
*
FROM messages
ORDER BY id DESC) AS x
GROUP BY name
But this looks highly inefficient. Any other ways to achieve the same result?
Answered by Bill Karwin
MySQL 8.0 now supports window functions, like almost all popular SQL implementations. With this standard syntax, we can write greatest-n-per-group queries:
WITH ranked_messages AS (
SELECT m.*, ROW_NUMBER() OVER (PARTITION BY name ORDER BY id DESC) AS rn
FROM messages AS m
)
SELECT * FROM ranked_messages WHERE rn = 1;
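As a quick check of the window-function query, here is a minimal sketch that runs it against the sample data from the question. It uses Python's built-in sqlite3 module as a stand-in for MySQL 8.0 (an assumption for convenience; SQLite 3.25 or later also implements ROW_NUMBER() OVER), and the explicit column list and final ORDER BY id are added only to make the output predictable.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL 8.0 (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE messages (id INTEGER PRIMARY KEY, name TEXT, other_columns TEXT)"
)
conn.executemany(
    "INSERT INTO messages VALUES (?, ?, ?)",
    [(1, "A", "A_data_1"), (2, "A", "A_data_2"), (3, "A", "A_data_3"),
     (4, "B", "B_data_1"), (5, "B", "B_data_2"), (6, "C", "C_data_1")],
)

# Rank rows inside each name group, newest id first, and keep rank 1.
last_per_group = conn.execute("""
    WITH ranked_messages AS (
        SELECT m.*, ROW_NUMBER() OVER (PARTITION BY name ORDER BY id DESC) AS rn
        FROM messages AS m
    )
    SELECT id, name, other_columns
    FROM ranked_messages
    WHERE rn = 1
    ORDER BY id
""").fetchall()

print(last_per_group)
# [(3, 'A', 'A_data_3'), (5, 'B', 'B_data_2'), (6, 'C', 'C_data_1')]
```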
Below is the original answer I wrote for this question in 2009:
I write the solution this way:
SELECT m1.*
FROM messages m1 LEFT JOIN messages m2
ON (m1.name = m2.name AND m1.id < m2.id)
WHERE m2.id IS NULL;
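To see why the LEFT JOIN form returns the last row per group, the sketch below runs it on the question's sample data, again with Python's sqlite3 module standing in for MySQL (an assumption). A row m1 survives only when no m2 with the same name and a larger id exists, so m2.id comes back NULL exactly for the newest row of each group. The ORDER BY is added only for stable output.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite as a stand-in for MySQL (assumption)
conn.execute(
    "CREATE TABLE messages (id INTEGER PRIMARY KEY, name TEXT, other_columns TEXT)"
)
conn.executemany(
    "INSERT INTO messages VALUES (?, ?, ?)",
    [(1, "A", "A_data_1"), (2, "A", "A_data_2"), (3, "A", "A_data_3"),
     (4, "B", "B_data_1"), (5, "B", "B_data_2"), (6, "C", "C_data_1")],
)

# Anti-join: keep m1 only if there is no later row m2 in the same group.
anti_join_rows = conn.execute("""
    SELECT m1.id, m1.name, m1.other_columns
    FROM messages m1 LEFT JOIN messages m2
      ON (m1.name = m2.name AND m1.id < m2.id)
    WHERE m2.id IS NULL
    ORDER BY m1.id
""").fetchall()

print(anti_join_rows)
# [(3, 'A', 'A_data_3'), (5, 'B', 'B_data_2'), (6, 'C', 'C_data_1')]
```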
Regarding performance, either solution can be faster, depending on the nature of your data. So you should test both queries and use whichever performs better on your database.
For example, I have a copy of the StackOverflow August data dump. I'll use that for benchmarking. There are 1,114,357 rows in the Posts table. This is running on MySQL 5.0.75 on my Macbook Pro 2.40GHz.
I'll write a query to find the most recent post for a given user ID (mine).
First, using the technique shown by @Eric with the GROUP BY in a subquery:
SELECT p1.postid
FROM Posts p1
INNER JOIN (SELECT pi.owneruserid, MAX(pi.postid) AS maxpostid
FROM Posts pi GROUP BY pi.owneruserid) p2
ON (p1.postid = p2.maxpostid)
WHERE p1.owneruserid = 20860;
1 row in set (1 min 17.89 sec)
Even the EXPLAIN analysis takes over 16 seconds:
+----+-------------+------------+--------+----------------------------+-------------+---------+--------------+---------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+------------+--------+----------------------------+-------------+---------+--------------+---------+-------------+
| 1 | PRIMARY | <derived2> | ALL | NULL | NULL | NULL | NULL | 76756 | |
| 1 | PRIMARY | p1 | eq_ref | PRIMARY,PostId,OwnerUserId | PRIMARY | 8 | p2.maxpostid | 1 | Using where |
| 2 | DERIVED | pi | index | NULL | OwnerUserId | 8 | NULL | 1151268 | Using index |
+----+-------------+------------+--------+----------------------------+-------------+---------+--------------+---------+-------------+
3 rows in set (16.09 sec)
Now produce the same query result using my technique with LEFT JOIN:
SELECT p1.postid
FROM Posts p1 LEFT JOIN posts p2
ON (p1.owneruserid = p2.owneruserid AND p1.postid < p2.postid)
WHERE p2.postid IS NULL AND p1.owneruserid = 20860;
1 row in set (0.28 sec)
The EXPLAIN analysis shows that both tables are able to use their indexes:
+----+-------------+-------+------+----------------------------+-------------+---------+-------+------+--------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------+------+----------------------------+-------------+---------+-------+------+--------------------------------------+
| 1 | SIMPLE | p1 | ref | OwnerUserId | OwnerUserId | 8 | const | 1384 | Using index |
| 1 | SIMPLE | p2 | ref | PRIMARY,PostId,OwnerUserId | OwnerUserId | 8 | const | 1384 | Using where; Using index; Not exists |
+----+-------------+-------+------+----------------------------+-------------+---------+-------+------+--------------------------------------+
2 rows in set (0.00 sec)
Here's the DDL for my Posts table:
CREATE TABLE `posts` (
`PostId` bigint(20) unsigned NOT NULL auto_increment,
`PostTypeId` bigint(20) unsigned NOT NULL,
`AcceptedAnswerId` bigint(20) unsigned default NULL,
`ParentId` bigint(20) unsigned default NULL,
`CreationDate` datetime NOT NULL,
`Score` int(11) NOT NULL default '0',
`ViewCount` int(11) NOT NULL default '0',
`Body` text NOT NULL,
`OwnerUserId` bigint(20) unsigned NOT NULL,
`OwnerDisplayName` varchar(40) default NULL,
`LastEditorUserId` bigint(20) unsigned default NULL,
`LastEditDate` datetime default NULL,
`LastActivityDate` datetime default NULL,
`Title` varchar(250) NOT NULL default '',
`Tags` varchar(150) NOT NULL default '',
`AnswerCount` int(11) NOT NULL default '0',
`CommentCount` int(11) NOT NULL default '0',
`FavoriteCount` int(11) NOT NULL default '0',
`ClosedDate` datetime default NULL,
PRIMARY KEY (`PostId`),
UNIQUE KEY `PostId` (`PostId`),
KEY `PostTypeId` (`PostTypeId`),
KEY `AcceptedAnswerId` (`AcceptedAnswerId`),
KEY `OwnerUserId` (`OwnerUserId`),
KEY `LastEditorUserId` (`LastEditorUserId`),
KEY `ParentId` (`ParentId`),
CONSTRAINT `posts_ibfk_1` FOREIGN KEY (`PostTypeId`) REFERENCES `posttypes` (`PostTypeId`)
) ENGINE=InnoDB;
Answered by newtover
UPD: 2017-03-31, version 5.7.5 of MySQL enabled the ONLY_FULL_GROUP_BY switch by default (hence, non-deterministic GROUP BY queries became disabled). Moreover, the GROUP BY implementation was updated, and the solution might no longer work as expected even with the switch disabled. You need to check.
Bill Karwin's solution above works fine when the item count within groups is rather small, but the performance of the query becomes bad when the groups are rather large, since for a group of n items the solution requires about n*n/2 + n/2 comparisons for the IS NULL check alone.
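The quadratic growth can be illustrated with a small experiment. The sketch below is an approximation of the cost, not a measurement of MySQL: it uses Python's sqlite3 module (an assumption) to count how many rows the LEFT JOIN produces for a single group before the IS NULL filter is applied. The helper name unfiltered_join_rows is hypothetical. A real MySQL plan may stop early (for example with the "Not exists" optimization), so treat the numbers as an upper-bound intuition only.

```python
import sqlite3

def unfiltered_join_rows(group_size):
    """Count rows the LEFT JOIN yields for one group before WHERE m2.id IS NULL."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE messages (id INTEGER PRIMARY KEY, name TEXT)")
    # A single group 'A' holding group_size rows.
    conn.executemany(
        "INSERT INTO messages VALUES (?, 'A')",
        [(i,) for i in range(1, group_size + 1)],
    )
    (count,) = conn.execute("""
        SELECT COUNT(*)
        FROM messages m1 LEFT JOIN messages m2
          ON (m1.name = m2.name AND m1.id < m2.id)
    """).fetchone()
    conn.close()
    return count

# Each row pairs with every later row in its group; the last row yields one NULL row.
counts = {n: unfiltered_join_rows(n) for n in (10, 100, 400)}
print(counts)
# {10: 46, 100: 4951, 400: 79801}
```

Growing the group by a factor of 10 grows the intermediate result by roughly a factor of 100, which matches the order of magnitude quoted above.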
I made my tests on an InnoDB table of 18684446 rows with 1182 groups. The table contains test results for functional tests and has (test_id, request_id) as the primary key. Thus, test_id is a group and I was searching for the last request_id for each test_id.
Bill's solution has already been running for several hours on my Dell E4310, and I do not know when it is going to finish, even though it operates on a covering index (hence Using index in EXPLAIN).
I have a couple of other solutions that are based on the same ideas:
- if the underlying index is a BTREE index (which is usually the case), the largest (group_id, item_value) pair is the last value within each group_id, that is, the first for each group_id if we walk through the index in descending order;
- if we read the values which are covered by an index, the values are read in the order of the index;
- each index implicitly contains the primary key columns appended to it (that is, the primary key is in the covering index). In the solutions below I operate directly on the primary key; in your case, you will just need to add the primary key columns to the result;
- in many cases it is much cheaper to collect the required row ids in the required order in a subquery and join the result of the subquery on the id. Since for each row in the subquery result MySQL will need a single fetch based on the primary key, the subquery will be put first in the join and the rows will be output in the order of the ids in the subquery (if we omit an explicit ORDER BY for the join).
The article "3 ways MySQL uses indexes" is a great read for understanding some of the details.
Solution 1
This one is incredibly fast; it takes about 0.8 secs on my 18M+ rows:
SELECT test_id, MAX(request_id) AS request_id
FROM testresults
GROUP BY test_id DESC;
If you want to change the order to ASC, put it in a subquery, return the ids only and use that as the subquery to join to the rest of the columns:
SELECT test_id, request_id
FROM (
SELECT test_id, MAX(request_id) AS request_id
FROM testresults
GROUP BY test_id DESC) as ids
ORDER BY test_id;
This one takes about 1.2 secs on my data.
Solution 2
Here is another solution that takes about 19 seconds for my table:
SELECT test_id, request_id
FROM testresults, (SELECT @group:=NULL) as init
WHERE IF(IFNULL(@group, -1)=@group:=test_id, 0, 1)
ORDER BY test_id DESC, request_id DESC
It returns tests in descending order as well. It is much slower since it does a full index scan but it is here to give you an idea how to output N max rows for each group.
The disadvantage of the query is that its result cannot be cached by the query cache.
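For the "N max rows for each group" idea, here is a minimal sketch that swaps in a different technique: a window function rather than the user-variable trick above. It assumes an engine with ROW_NUMBER() (MySQL 8.0 or later; Python's sqlite3 module is used here as a stand-in), and the tiny testresults data set and the value N = 2 are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite as a stand-in for MySQL 8.0 (assumption)
conn.execute(
    "CREATE TABLE testresults (test_id INTEGER, request_id INTEGER, "
    "PRIMARY KEY (test_id, request_id))"
)
# Illustrative data: test 1 has three requests, test 2 has two, test 3 has one.
conn.executemany(
    "INSERT INTO testresults VALUES (?, ?)",
    [(1, 10), (1, 11), (1, 12), (2, 20), (2, 21), (3, 30)],
)

N = 2  # how many of the latest rows to keep per group (hypothetical choice)
top_n = conn.execute("""
    SELECT test_id, request_id
    FROM (
        SELECT test_id, request_id,
               ROW_NUMBER() OVER (PARTITION BY test_id ORDER BY request_id DESC) AS rn
        FROM testresults
    ) AS ranked
    WHERE rn <= ?
    ORDER BY test_id DESC, request_id DESC
""", (N,)).fetchall()

print(top_n)
# [(3, 30), (2, 21), (2, 20), (1, 12), (1, 11)]
```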
Answered by Eric
Use your subquery to return the correct grouping, because you're halfway there.
Try this:
select
a.*
from
messages a
inner join
(select name, max(id) as maxid from messages group by name) as b on
a.id = b.maxid
If it's not id you want the max of:
select
a.*
from
messages a
inner join
(select name, max(other_col) as other_col
from messages group by name) as b on
a.name = b.name
and a.other_col = b.other_col
This way, you avoid correlated subqueries and/or ordering in your subqueries, which tend to be very slow/inefficient.
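One thing worth checking with the second form is what happens when the maximum of other_col is not unique within a group. The sketch below (Python's sqlite3 module as a stand-in for MySQL, with made-up numeric other_col values) suggests that every row tied for the maximum is returned, so a group can yield more than one row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite as a stand-in for MySQL (assumption)
conn.execute(
    "CREATE TABLE messages (id INTEGER PRIMARY KEY, name TEXT, other_col INTEGER)"
)
# Illustrative data: group B has two rows sharing the maximum other_col.
conn.executemany(
    "INSERT INTO messages VALUES (?, ?, ?)",
    [(1, "A", 10), (2, "A", 30), (3, "A", 20), (4, "B", 5), (5, "B", 5)],
)

# Join each row to its group's maximum other_col.
max_rows = conn.execute("""
    SELECT a.id, a.name, a.other_col
    FROM messages a
    INNER JOIN (SELECT name, MAX(other_col) AS other_col
                FROM messages GROUP BY name) AS b
      ON a.name = b.name AND a.other_col = b.other_col
    ORDER BY a.id
""").fetchall()

print(max_rows)
# [(2, 'A', 30), (4, 'B', 5), (5, 'B', 5)]
```

If exactly one row per group is required, a tie-breaker on a unique column such as id would be needed.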
Answered by JYelton
I arrived at a different solution, which is to get the IDs for the last post within each group, then select from the messages table using the result from the first query as the argument for a WHERE x IN construct:
SELECT id, name, other_columns
FROM messages
WHERE id IN (
SELECT MAX(id)
FROM messages
GROUP BY name
);
I don't know how this performs compared to some of the other solutions, but it worked spectacularly for my table with 3+ million rows. (4 second execution with 1200+ results)
This should work both on MySQL and SQL Server.
Answered by Vipin
Solution by subquery (fiddle link):
select * from messages where id in
(select max(id) from messages group by Name)
Solution by join condition (fiddle link):
select m1.* from messages m1
left outer join messages m2
on ( m1.id<m2.id and m1.name=m2.name )
where m2.id is null
The reason for this post is only to give the fiddle links. The same SQL is already provided in other answers.
Answered by Song Zhengyi
An approach with considerable speed is as follows.
SELECT *
FROM messages a
WHERE Id = (SELECT MAX(Id) FROM messages WHERE a.Name = Name)
Result
Id Name Other_Columns
3 A A_data_3
5 B B_data_2
6 C C_data_1
Answered by Song Zhengyi
I've not yet tested with a large DB, but I think this could be faster than joining tables:
SELECT *, Max(Id) FROM messages GROUP BY Name
Answered by Steve Kass
Here are two suggestions. First, if MySQL supports ROW_NUMBER(), it's very simple:
WITH Ranked AS (
SELECT Id, Name, OtherColumns,
ROW_NUMBER() OVER (
PARTITION BY Name
ORDER BY Id DESC
) AS rk
FROM messages
)
SELECT Id, Name, OtherColumns
FROM Ranked
WHERE rk = 1;
I'm assuming by "last" you mean last in Id order. If not, change the ORDER BY clause of the ROW_NUMBER() window accordingly.
Second, if ROW_NUMBER() isn't available, this is often a good way to proceed:
SELECT
Id, Name, OtherColumns
FROM messages
WHERE NOT EXISTS (
SELECT * FROM messages as M2
WHERE M2.Name = messages.Name
AND M2.Id > messages.Id
)
In other words, select messages where there is no later-Id message with the same Name.
Answered by Yoseph
Clearly there are lots of different ways of getting the same results; your question seems to be what is an efficient way of getting the last result in each group in MySQL. If you are working with huge amounts of data, and assuming you are using InnoDB with even the latest versions of MySQL (such as 5.7.21 and 8.0.4-rc), then there might not be an efficient way of doing this.
We sometimes need to do this with tables with even more than 60 million rows.
For these examples I will use data with only about 1.5 million rows where the queries would need to find results for all groups in the data. In our actual cases we would often need to return back data from about 2,000 groups (which hypothetically would not require examining very much of the data).
I will use the following tables:
CREATE TABLE temperature(
id INT UNSIGNED NOT NULL AUTO_INCREMENT,
groupID INT UNSIGNED NOT NULL,
recordedTimestamp TIMESTAMP NOT NULL,
recordedValue INT NOT NULL,
INDEX groupIndex(groupID, recordedTimestamp),
PRIMARY KEY (id)
);
CREATE TEMPORARY TABLE selected_group(id INT UNSIGNED NOT NULL, PRIMARY KEY(id));
The temperature table is populated with about 1.5 million random records, and with 100 different groups. The selected_group is populated with those 100 groups (in our cases this would normally be less than 20% for all of the groups).
As this data is random, multiple rows can have the same recordedTimestamp. What we want is a list of all of the selected groups in order of groupID, with the last recordedTimestamp for each group, and, if the same group has more than one matching row with that timestamp, the last matching id of those rows.
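The tie-breaking requirement can be made concrete on a few rows. The following is a minimal sketch, assuming Python's sqlite3 module as a stand-in for MySQL, integer timestamps instead of TIMESTAMP values, and a made-up four-row data set in which group 1 has two rows sharing its latest recordedTimestamp. For each group it picks the row with the latest timestamp and, among ties, the highest id.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite as a stand-in for MySQL (assumption)
conn.execute("""
    CREATE TABLE temperature (
        id INTEGER PRIMARY KEY,
        groupID INTEGER NOT NULL,
        recordedTimestamp INTEGER NOT NULL,  -- integer stand-in for TIMESTAMP
        recordedValue INTEGER NOT NULL
    )
""")
conn.execute("CREATE INDEX groupIndex ON temperature (groupID, recordedTimestamp)")
# Illustrative data: ids 2 and 3 tie on the latest timestamp of group 1.
conn.executemany(
    "INSERT INTO temperature VALUES (?, ?, ?, ?)",
    [(1, 1, 100, 5), (2, 1, 200, 6), (3, 1, 200, 7), (4, 2, 50, 1)],
)

# Latest timestamp wins; the highest id breaks ties within a group.
latest = conn.execute("""
    SELECT t1.id, t1.groupID, t1.recordedTimestamp, t1.recordedValue
    FROM temperature t1
    WHERE t1.id = (
        SELECT t2.id
        FROM temperature t2
        WHERE t2.groupID = t1.groupID
        ORDER BY t2.recordedTimestamp DESC, t2.id DESC
        LIMIT 1
    )
    ORDER BY t1.groupID
""").fetchall()

print(latest)
# [(3, 1, 200, 7), (4, 2, 50, 1)]
```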
If hypothetically MySQL had a last() function which returned values from the last row in a special ORDER BY clause then we could simply do:
SELECT
last(t1.id) AS id,
t1.groupID,
last(t1.recordedTimestamp) AS recordedTimestamp,
last(t1.recordedValue) AS recordedValue
FROM selected_group g
INNER JOIN temperature t1 ON t1.groupID = g.id
ORDER BY t1.recordedTimestamp, t1.id
GROUP BY t1.groupID;
which would only need to examine a few hundred rows in this case, as it doesn't use any of the normal GROUP BY functions. This would execute in 0 seconds and hence be highly efficient. Note that normally in MySQL we would see an ORDER BY clause following the GROUP BY clause; however, this ORDER BY clause is used to determine the order for the last() function. If it came after the GROUP BY, it would be ordering the groups. If no GROUP BY clause is present, then the last values will be the same in all of the returned rows.
However MySQL does not have this so let's look at different ideas of what it does have and prove that none of these are efficient.
Example 1
SELECT t1.id, t1.groupID, t1.recordedTimestamp, t1.recordedValue
FROM selected_group g
INNER JOIN temperature t1 ON t1.id = (
SELECT t2.id
FROM temperature t2
WHERE t2.groupID = g.id
ORDER BY t2.recordedTimestamp DESC, t2.id DESC
LIMIT 1
);
This examined 3,009,254 rows and took ~0.859 seconds on 5.7.21 and slightly longer on 8.0.4-rc
Example 2
SELECT t1.id, t1.groupID, t1.recordedTimestamp, t1.recordedValue
FROM temperature t1
INNER JOIN (
SELECT max(t2.id) AS id
FROM temperature t2
INNER JOIN (
SELECT t3.groupID, max(t3.recordedTimestamp) AS recordedTimestamp
FROM selected_group g
INNER JOIN temperature t3 ON t3.groupID = g.id
GROUP BY t3.groupID
) t4 ON t4.groupID = t2.groupID AND t4.recordedTimestamp = t2.recordedTimestamp
GROUP BY t2.groupID
) t5 ON t5.id = t1.id;
This examined 1,505,331 rows and took ~1.25 seconds on 5.7.21 and slightly longer on 8.0.4-rc
Example 3
SELECT t1.id, t1.groupID, t1.recordedTimestamp, t1.recordedValue
FROM temperature t1
WHERE t1.id IN (
SELECT max(t2.id) AS id
FROM temperature t2
INNER JOIN (
SELECT t3.groupID, max(t3.recordedTimestamp) AS recordedTimestamp
FROM selected_group g
INNER JOIN temperature t3 ON t3.groupID = g.id
GROUP BY t3.groupID
) t4 ON t4.groupID = t2.groupID AND t4.recordedTimestamp = t2.recordedTimestamp
GROUP BY t2.groupID
)
ORDER BY t1.groupID;
This examined 3,009,685 rows and took ~1.95 seconds on 5.7.21 and slightly longer on 8.0.4-rc
Example 4
SELECT t1.id, t1.groupID, t1.recordedTimestamp, t1.recordedValue
FROM selected_group g
INNER JOIN temperature t1 ON t1.id = (
SELECT max(t2.id)
FROM temperature t2
WHERE t2.groupID = g.id AND t2.recordedTimestamp = (
SELECT max(t3.recordedTimestamp)
FROM temperature t3
WHERE t3.groupID = g.id
)
);
This examined 6,137,810 rows and took ~2.2 seconds on 5.7.21 and slightly longer on 8.0.4-rc
Example 5
SELECT t1.id, t1.groupID, t1.recordedTimestamp, t1.recordedValue
FROM (
SELECT
t2.id,
t2.groupID,
t2.recordedTimestamp,
t2.recordedValue,
row_number() OVER (
PARTITION BY t2.groupID ORDER BY t2.recordedTimestamp DESC, t2.id DESC
) AS rowNumber
FROM selected_group g
INNER JOIN temperature t2 ON t2.groupID = g.id
) t1 WHERE t1.rowNumber = 1;
This examined 6,017,808 rows and took ~4.2 seconds on 8.0.4-rc
Example 6
SELECT t1.id, t1.groupID, t1.recordedTimestamp, t1.recordedValue
FROM (
SELECT
last_value(t2.id) OVER w AS id,
t2.groupID,
last_value(t2.recordedTimestamp) OVER w AS recordedTimestamp,
last_value(t2.recordedValue) OVER w AS recordedValue
FROM selected_group g
INNER JOIN temperature t2 ON t2.groupID = g.id
WINDOW w AS (
PARTITION BY t2.groupID
ORDER BY t2.recordedTimestamp, t2.id
RANGE BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
)
) t1
GROUP BY t1.groupID;
This examined 6,017,908 rows and took ~17.5 seconds on 8.0.4-rc
Example 7
SELECT t1.id, t1.groupID, t1.recordedTimestamp, t1.recordedValue
FROM selected_group g
INNER JOIN temperature t1 ON t1.groupID = g.id
LEFT JOIN temperature t2
ON t2.groupID = g.id
AND (
t2.recordedTimestamp > t1.recordedTimestamp
OR (t2.recordedTimestamp = t1.recordedTimestamp AND t2.id > t1.id)
)
WHERE t2.id IS NULL
ORDER BY t1.groupID;
This one was taking forever so I had to kill it.
Answered by M Khalid Junaid
Here is another way to get the last related record, using GROUP_CONCAT with ORDER BY and SUBSTRING_INDEX to pick one of the records from the list:
SELECT
`Id`,
`Name`,
SUBSTRING_INDEX(
GROUP_CONCAT(
`Other_Columns`
ORDER BY `Id` DESC
SEPARATOR '||'
),
'||',
1
) Other_Columns
FROM
messages
GROUP BY `Name`
The above query groups all the Other_Columns values that are in the same Name group and, using ORDER BY id DESC, joins all the Other_Columns values in a specific group in descending order with the provided separator (in my case I have used ||). Using SUBSTRING_INDEX over this list then picks the first one.
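The concatenate-then-cut idea can be sketched outside the database. The snippet below is a plain Python emulation (not MySQL itself, and the function name last_other_columns is hypothetical): it sorts each Name group by Id descending, joins the Other_Columns values with the || separator as GROUP_CONCAT would, and keeps the text before the first separator as SUBSTRING_INDEX(..., '||', 1) would.

```python
from itertools import groupby

SEPARATOR = "||"

# Sample rows from the question: (Id, Name, Other_Columns).
rows = [
    (1, "A", "A_data_1"), (2, "A", "A_data_2"), (3, "A", "A_data_3"),
    (4, "B", "B_data_1"), (5, "B", "B_data_2"), (6, "C", "C_data_1"),
]

def last_other_columns(rows):
    """Emulate GROUP_CONCAT(... ORDER BY Id DESC) followed by SUBSTRING_INDEX(..., 1)."""
    result = {}
    by_name = sorted(rows, key=lambda r: r[1])  # groupby needs sorted input
    for name, group in groupby(by_name, key=lambda r: r[1]):
        ordered = sorted(group, key=lambda r: r[0], reverse=True)  # ORDER BY Id DESC
        concatenated = SEPARATOR.join(r[2] for r in ordered)       # GROUP_CONCAT
        result[name] = concatenated.split(SEPARATOR, 1)[0]         # first element only
    return result

latest_values = last_other_columns(rows)
print(latest_values)
# {'A': 'A_data_3', 'B': 'B_data_2', 'C': 'C_data_1'}
```

The emulation also makes one limitation visible: if a value itself contained the separator, the cut would truncate it, so the separator has to be a string that cannot occur in the data.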