Notice: this page is a Chinese-English parallel translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same license and attribute it to the original authors (not me), citing the original address: http://stackoverflow.com/questions/9328238/

Retrieved: 2020-09-19 00:37:17. Source: igfitidea

Comparison Group by VS Over Partition By

Tags: sql, oracle, database-performance

Asked by Mik378

Assume a table CAR with two columns, CAR_ID (int) and VERSION (int).

I want to retrieve the maximum version of each car.

There are (at least) two solutions:

select car_id, max(version) as max_version 
  from car  
 group by car_id;

Or:

select car_id, max_version
  from (select car_id
             , version
             , max(version) over (partition by car_id) as max_version
          from car) max_ver
 where max_ver.version = max_ver.max_version;

Are these two queries similarly performant?

Accepted answer by Java

Yes, it may affect performance.

The second query is an example of an inline view. It is a very useful technique for building reports that need various counts or other aggregate functions.

Oracle executes the subquery and then uses the resulting rows as a view in the FROM clause.

Where performance is a concern, I always recommend an inline view over the other subquery types.

One more thing: the second query returns every row that holds the maximum version, so a car whose maximum version appears in several rows is returned several times, while the first query returns exactly one row per car.
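
The difference in row counts can be seen on a small sample. The sketch below assumes a throwaway CAR table with made-up rows in which car 1 holds its maximum version twice; the data is illustrative only and does not come from the original question.

```sql
-- Hypothetical sample data: car 1 has its maximum version (2) in two rows.
create table car (car_id int, version int);
insert into car values (1, 1);
insert into car values (1, 2);
insert into car values (1, 2);
insert into car values (2, 5);

-- GROUP BY: exactly one row per car_id.
select car_id, max(version) as max_version
  from car
 group by car_id;

-- Inline view with the analytic MAX: every row whose version equals the
-- partition maximum passes the filter, so car 1 comes back twice.
select car_id, max_version
  from (select car_id
             , version
             , max(version) over (partition by car_id) as max_version
          from car) max_ver
 where max_ver.version = max_ver.max_version;
```

With this data the first query returns two rows and the second returns three. If (car_id, version) is unique, both queries return the same rows.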

see here

Answer by JustDave

I know this is extremely old, but I thought it should be pointed out.

select car_id, max_version 
  from (select car_id
             , version
             , max(version) over (partition by car_id) as max_version
          from car ) max_ver  
 where max_ver.version = max_ver.max_version

I'm not sure why you wrote option two like that. In this case the subselect should theoretically be slower, because you're selecting from the same table twice and then joining the results back to itself.

Just remove version from your inline view and they are the same thing.

select car_id, max(version) over (partition by car_id) as max_version
  from car
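
One caveat, sketched under the assumption that CAR holds several versions per car: an analytic function does not collapse rows, so the query above returns one row for every row in CAR. Adding DISTINCT is one way to get the same result set as the GROUP BY version; whether the optimizer then treats the two forms the same way is something to check in the execution plan.

```sql
-- Assumes the CAR (car_id, version) table from the question.
-- Analytic MAX keeps every input row, so DISTINCT is needed to get
-- one row per car_id, matching the GROUP BY result.
select distinct car_id
     , max(version) over (partition by car_id) as max_version
  from car;
```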

The performance really depends on the optimizer in this situation, but yes, as the original answer suggests, inline views help because they narrow the results. This is not a good example, though, since it is the same table with no filters in the given selections.

Partitioning is also helpful when you are selecting a lot of columns but need different aggregations that fit the result set. Otherwise you are forced to group by every other column.
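
As a rough illustration of that point, the sketch below uses a hypothetical wider table, CAR_DETAIL, whose MODEL and PRICE columns are invented for the example. Each analytic function carries its own partition, so the detail columns stay in the result without any GROUP BY.

```sql
-- Hypothetical table; MODEL and PRICE are illustrative columns that do
-- not exist in the original CAR table.
create table car_detail (
    car_id  int,
    version int,
    model   varchar(30),
    price   int
);

-- Every detail row is kept, and each aggregate uses its own partition.
select car_id
     , version
     , model
     , price
     , max(version) over (partition by car_id) as max_version
     , count(*)     over (partition by car_id) as version_count
     , avg(price)   over (partition by model)  as avg_model_price
  from car_detail;
```

A GROUP BY version would have to list every non-aggregated column in the GROUP BY clause, and it could not mix per-car and per-model aggregates in a single pass without extra joins or subqueries.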

Answer by Dave Markle

It will depend on your indexing scheme and the amount of data in the table. The optimizer will likely make different decisions based on the data that's actually inside the table.

I have found, at least in SQL Server (I know you asked about Oracle), that the optimizer is more likely to perform a full scan with the PARTITION BY query than with the GROUP BY query. But that's only in cases where you have an index that contains CAR_ID and VERSION (DESC).

The moral of the story is that I would test thoroughly to choose the right one. For small tables, it doesn't matter. For really, really big data sets, neither may be fast...
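
One way to start such a test in Oracle is to compare the plans the optimizer picks for each form. The sketch below assumes the CAR table from the question; the index name is made up, and the index itself is optional. EXPLAIN PLAN shows only the estimated plan, so timing both queries against realistic data volumes is still needed.

```sql
-- Optional: an index on CAR_ID and VERSION (DESC); the name is hypothetical.
create index car_id_version_idx on car (car_id, version desc);

-- Estimated plan for the GROUP BY form.
explain plan for
select car_id, max(version) as max_version
  from car
 group by car_id;

select * from table(dbms_xplan.display);

-- Estimated plan for the analytic (PARTITION BY) form.
explain plan for
select car_id, max_version
  from (select car_id
             , version
             , max(version) over (partition by car_id) as max_version
          from car) max_ver
 where max_ver.version = max_ver.max_version;

select * from table(dbms_xplan.display);
```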
