MySQL Order before Group by
Note: this page is a translated copy of a popular Stack Overflow question and its answers, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license and attribute it to the original authors on Stack Overflow.
Original question: http://stackoverflow.com/questions/5140785/
Asked by Tom
I need to find the latest post for each author and then group the results so that I get only a single latest post for each author.
SELECT wp_posts.* FROM wp_posts
WHERE wp_posts.post_status='publish'
AND wp_posts.post_type='post'
GROUP BY wp_posts.post_author
ORDER BY wp_posts.post_date DESC
This correctly groups the output so I only get one post per author, but it orders the results after they have been grouped, not before the rows are selected.
Accepted answer by edze
select wp_posts.* from wp_posts
where wp_posts.post_status='publish' and wp_posts.post_type='post'
group by wp_posts.post_author
having wp_posts.post_date = MAX(wp_posts.post_date) /* ONLY THE LAST POST FOR EACH AUTHOR */
order by wp_posts.post_date desc
EDIT:
After some comments I have decided to add some additional information.
The company I work at also uses Postgres and especially SQL Server. These databases don't allow such queries, so I know that there is another way to do this (I give a solution below). You should also know what you are doing if you don't group by all columns used in the projection or use aggregate functions. Otherwise leave it be!
I chose the solution above because it's a specific question. Tom wants to get the most recent post for each author on a WordPress site. In my mind it is negligible for the analysis if an author makes more than one post per second. WordPress should even forbid it through its spam/double-post detection. I know from personal experience that there is a really significant performance benefit in doing such a dirty GROUP BY with MySQL. If you know what you are doing, then you can do it! I have such dirty groups in apps that I'm professionally accountable for. There I have tables with several million rows, where the queries need 5-15 seconds instead of 100+ seconds.
This may be useful regarding some pros and cons: http://ftp.nchu.edu.tw/MySQL/tech-resources/articles/debunking-group-by-myths.html
SELECT
wp_posts.*
FROM
wp_posts
JOIN
(
SELECT
g.post_author,
MAX(g.post_date) AS post_date
FROM wp_posts as g
WHERE
g.post_status='publish'
AND g.post_type='post'
GROUP BY g.post_author
) as t
ON wp_posts.post_author = t.post_author AND wp_posts.post_date = t.post_date
ORDER BY wp_posts.post_date
But if there is more than one post per second for an author, you will get more than one row and not only the last one.
Now you can spin the wheel again and get the post with the highest Id. Even here it is not guaranteed that you really get the last one.
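As a rough illustration of the tie-break described above, the sketch below joins each author's latest post_date back to the table and then keeps the highest ID among the posts sharing that date. It runs against an in-memory SQLite database purely so it is self-contained; the sample rows, the reduced column list, and the extra status filter on the outer query are assumptions of this sketch, not part of the original answer.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (an assumption of this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE wp_posts (
    ID INTEGER PRIMARY KEY,
    post_author INTEGER,
    post_date TEXT,
    post_status TEXT,
    post_type TEXT
);
-- Hypothetical sample rows; IDs 2 and 3 share the same second.
INSERT INTO wp_posts VALUES
    (1, 10, '2011-02-01 09:00:00', 'publish', 'post'),
    (2, 10, '2011-02-05 12:00:00', 'publish', 'post'),
    (3, 10, '2011-02-05 12:00:00', 'publish', 'post'),
    (4, 20, '2011-02-03 08:30:00', 'publish', 'post'),
    (5, 20, '2011-02-09 08:30:00', 'draft',   'post');
""")

# Inner query: latest post_date per author among published posts.
# Outer query: among the posts carrying that date, keep the highest ID,
# so every author yields exactly one row even when dates collide.
latest_per_author = conn.execute("""
SELECT p.post_author, MAX(p.ID) AS ID, t.post_date
FROM wp_posts AS p
JOIN (
    SELECT g.post_author, MAX(g.post_date) AS post_date
    FROM wp_posts AS g
    WHERE g.post_status = 'publish' AND g.post_type = 'post'
    GROUP BY g.post_author
) AS t
  ON p.post_author = t.post_author AND p.post_date = t.post_date
WHERE p.post_status = 'publish' AND p.post_type = 'post'
GROUP BY p.post_author, t.post_date
ORDER BY t.post_date DESC
""").fetchall()

for row in latest_per_author:
    print(row)
```

The full row can then be fetched by joining the resulting IDs back to wp_posts. As the answer notes, the highest ID is only a tie-breaker; it does not prove which post was really written last.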
Answer by Lieven Keersmaekers
I'm not sure I understand your requirement correctly, but the following inner statement gets the latest post_date for each author and joins these back to the wp_posts table to get the complete record.
SELECT *
FROM wp_posts wp
INNER JOIN (
SELECT post_author
, MAX(post_date) AS post_date
FROM wp_posts
WHERE post_status = 'publish'
AND post_type = 'post'
GROUP BY
post_author
) wpmax ON wpmax.post_author = wp.post_author
AND wpmax.post_date = wp.post_date
ORDER BY
wp.post_date DESC
Answer by aanton
I think that @edze's response is wrong.
In the MySQL manual you can read:
MySQL extends the use of GROUP BY so that the select list can refer to nonaggregated columns not named in the GROUP BY clause. You can use this feature to get better performance by avoiding unnecessary column sorting and grouping. However, this is useful primarily when all values in each nonaggregated column not named in the GROUP BY are the same for each group. The server is free to choose any value from each group, so unless they are the same, the values chosen are indeterminate. Furthermore, the selection of values from each group cannot be influenced by adding an ORDER BY clause. Sorting of the result set occurs after values have been chosen, and ORDER BY does not affect which values the server chooses.
Two great references:
- http://kristiannielsen.livejournal.com/6745.html
- http://www.xaprb.com/blog/2006/12/07/how-to-select-the-firstleastmax-row-per-group-in-sql/
Sorry, but I cannot comment on @edze's response because of my reputation, so I have written a new answer.
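One way to avoid depending on which value the server picks is to leave GROUP BY out entirely and filter with NOT EXISTS, keeping a post only when the same author has no newer published post. The sketch below is a minimal illustration run against an in-memory SQLite database; the sample rows, the reduced column list, and the alias names `p` and `newer` are assumptions made for this example.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (an assumption of this sketch).
db = sqlite3.connect(":memory:")
db.executescript("""
CREATE TABLE wp_posts (
    ID INTEGER PRIMARY KEY,
    post_author INTEGER,
    post_date TEXT,
    post_status TEXT,
    post_type TEXT
);
-- Hypothetical sample rows.
INSERT INTO wp_posts VALUES
    (1, 10, '2011-02-01 09:00:00', 'publish', 'post'),
    (2, 10, '2011-02-05 12:00:00', 'publish', 'post'),
    (3, 20, '2011-02-03 08:30:00', 'publish', 'post'),
    (4, 20, '2011-02-09 08:30:00', 'draft',   'post');
""")

# Keep a published post only if no newer published post exists for the
# same author. No GROUP BY is involved, so every selected column comes
# from one well-defined row.
newest_posts = db.execute("""
SELECT p.ID, p.post_author, p.post_date
FROM wp_posts AS p
WHERE p.post_status = 'publish' AND p.post_type = 'post'
  AND NOT EXISTS (
      SELECT 1
      FROM wp_posts AS newer
      WHERE newer.post_author = p.post_author
        AND newer.post_status = 'publish' AND newer.post_type = 'post'
        AND newer.post_date > p.post_date
  )
ORDER BY p.post_date DESC
""").fetchall()

for row in newest_posts:
    print(row)
```

If two posts by one author share the same post_date, both rows pass the filter, so a tie-breaker such as the ID would still be needed in that case.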
Answer by 11101101b
Do a GROUP BY after the ORDER BY by wrapping your query with the GROUP BY like this:
SELECT t.* FROM (SELECT * FROM table ORDER BY time DESC) t GROUP BY t.author
Answer by Husky110
It doesn't matter whether you order before or after the GROUP BY, because ordering only means that 213 becomes 123 or 321 and nothing more. GROUP BY takes just some entry for each column, not necessarily the latest. I suggest working with subselects here, like:
SELECT wp_posts.* FROM wp_posts
WHERE wp_posts.post_status='publish'
AND wp_posts.post_type='post'
AND wp_posts.post_date = (Select max(post_date) from wp_posts where author = ... )
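The subselect above leaves its condition elided. One possible reading, shown here only as a sketch, is a correlated subquery that compares each post's date with the maximum date for the same post_author. The correlation on post_author, the alias names `p` and `latest`, and the sample rows are assumptions of this example, and SQLite is used in place of MySQL so that it runs on its own.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (an assumption of this sketch).
connection = sqlite3.connect(":memory:")
connection.executescript("""
CREATE TABLE wp_posts (
    ID INTEGER PRIMARY KEY,
    post_author INTEGER,
    post_date TEXT,
    post_status TEXT,
    post_type TEXT
);
-- Hypothetical sample rows for three authors.
INSERT INTO wp_posts VALUES
    (1, 10, '2011-02-01 09:00:00', 'publish', 'post'),
    (2, 10, '2011-02-05 12:00:00', 'publish', 'post'),
    (3, 20, '2011-02-03 08:30:00', 'publish', 'post'),
    (4, 30, '2011-01-15 18:45:00', 'publish', 'post'),
    (5, 30, '2011-02-07 07:10:00', 'publish', 'post');
""")

# For every published post, the subquery is evaluated for that post's author
# and returns the author's latest published post_date; only matching posts stay.
subselect_rows = connection.execute("""
SELECT p.ID, p.post_author, p.post_date
FROM wp_posts AS p
WHERE p.post_status = 'publish' AND p.post_type = 'post'
  AND p.post_date = (
      SELECT MAX(latest.post_date)
      FROM wp_posts AS latest
      WHERE latest.post_author = p.post_author
        AND latest.post_status = 'publish' AND latest.post_type = 'post'
  )
ORDER BY p.post_date DESC
""").fetchall()

for row in subselect_rows:
    print(row)
```

Because the subquery runs once per candidate row, an index covering post_author and post_date is likely to matter on large tables.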
Answer by Alex
What do you think about this? It seems to work for me:
SELECT wp_posts.post_author, MAX(wp_posts.post_date), wp_posts.post_status, wp_posts.post_type
FROM wp_posts
WHERE wp_posts.post_status='publish'
AND wp_posts.post_type='post'
GROUP BY wp_posts.post_author
It brings me all the authors with their most recent post_date ... Do you see a problem there? I don't.
Answer by Mhd Jazaery
SELECT wp_posts.*,max(wp_posts.post_date) FROM wp_posts
WHERE wp_posts.post_status='publish'
AND wp_posts.post_type='post'
GROUP BY wp_posts.post_author
Answer by Kausha Mehta
Use the code below...
<?php
//get all users, iterate through users, query for one post for the user,
//if there is a post then display the post title, author, content info
$blogusers = get_users_of_blog();
if ($blogusers) {
foreach ($blogusers as $bloguser) {
$args = array(
'author' => $bloguser->user_id,
'showposts' => 1,
'caller_get_posts' => 1
);
$my_query = new WP_Query($args);
if( $my_query->have_posts() ) {
// $user = get_userdata($bloguser->user_id);
// echo 'This is one post for author with User ID: ' . $user->ID . ' ' . $user->user_firstname . ' ' . $user->user_lastname;
while ($my_query->have_posts()) : $my_query->the_post(); ?>
<a href="<?php the_permalink() ?>" rel="bookmark" title="Permanent Link to <?php the_title_attribute(); ?>"><?php the_title(); ?></a>
<small><?php the_time('F jS, Y') ?> by <?php the_author_posts_link() ?> </small><?php
the_content();
endwhile;
}
}
}
?>
Answer by Black Bronco
Here is a simple answer from http://www.cafewebmaster.com/mysql-order-sort-group
SELECT * FROM
(
select * from `my_table` order by timestamp desc
) as my_table_tmp
GROUP BY catid
ORDER BY nid desc
It worked wonders for me.
Answer by Moshe L
When a table becomes large, performance also needs to be checked. I tested all the options in the answers here on a PM system with 136K messages and a link table with 83K rows.
When you need only a count, or only IDs, Alex's solution is the best.
SELECT wp_posts.post_author, MAX(wp_posts.post_date), wp_posts.post_status, wp_posts.post_type
FROM wp_posts
WHERE wp_posts.post_status='publish'
AND wp_posts.post_type='post'
GROUP BY wp_posts.post_author
When you need other fields, I had to modify Husky110's solution (adapted to my table design; here it is only an example and has not been tested), which on my tables is 10x faster than the subquery option:
SELECT wp_posts.* FROM wp_posts,
(Select post_id as pid, max(post_date) maxdate from wp_posts where author = ... group by author order by maxdate desc limit 4) t
WHERE wp_posts.post_status='publish'
AND wp_posts.post_type='post'
AND wp_posts.post_id = pid
This change can select more than one post (one per user, for example), and can be adapted to the other solutions.
Moshe.