SQL SQL查询以获取最新价格

声明:本页面是StackOverFlow热门问题的中英对照翻译,遵循CC BY-SA 4.0协议,如果您需要使用它,必须同样遵循CC BY-SA许可,注明原文地址和作者信息,同时你必须将它归于原作者(不是我):StackOverFlow 原文地址: http://stackoverflow.com/questions/49404/
Warning: these are provided under cc-by-sa 4.0 license. You are free to use/share it, But you must attribute it to the original authors (not me): StackOverFlow

提示:将鼠标放在中文语句上可以显示对应的英文。显示中英文
时间:2020-08-31 23:17:14  来源:igfitidea点击:

SQL Query to get latest price

sqlsql-serversql-server-2005

提问by Marius

I have a table containing prices for a lot of different "things" in a MS SQL 2005 table. There are hundreds of records per thing per day and the different things gets price updates at different times.

我有一个表,其中包含 MS SQL 2005 表中许多不同“事物”的价格。每件物品每天有数百条记录,不同的物品在不同时间获得价格更新。

ID uniqueidentifier not null,
ThingID int NOT NULL,
PriceDateTime datetime NOT NULL,
Price decimal(18,4) NOT NULL

I need to get today's latest prices for a group of things. The below query works but I'm getting hundreds of rows back and I have to loop trough them and only extract the latest one per ThingID. How can I (e.g. via a GROUP BY) say that I want the latest one per ThingID? Or will I have to use subqueries?

我需要获取一组商品今天的最新价格。下面的查询有效,但我得到了数百行,我必须循环遍历它们,并且每个 ThingID 只提取最新的一行。我如何(例如通过 GROUP BY)说我想要每个 ThingID 的最新一个?还是我必须使用子查询?

SELECT * 
FROM Thing
WHERE ThingID IN (1,2,3,4,5,6)
  AND PriceDate > cast( convert(varchar(20), getdate(), 106) as DateTime) 

UPDATE:In an attempt to hide complexity I put the ID column in a an int. In real life it is GUID (and not the sequential kind). I have updated the table def above to use uniqueidentifier.

更新:为了隐藏复杂性,我将 ID 列放在一个 int 中。在现实生活中,它是 GUID(而不是顺序类型)。我已经更新了上面的表 def 以使用 uniqueidentifier。

回答by BlaM

I think the only solution with your table structure is to work with a subquery:

我认为您的表结构的唯一解决方案是使用子查询:

SELECT *
   FROM Thing
   WHERE ID IN (SELECT max(ID) FROM Thing 
                   WHERE ThingID IN (1,2,3,4)
                   GROUP BY ThingID)

(Given the highest ID also means the newest price)

(给定最高的ID也意味着最新的价格)

However I suggest you add a "IsCurrent" column that is 0 if it's not the latest price or 1 if it is the latest. This will add the possible risk of inconsistent data, but it will speed up the whole process a lot when the table gets bigger (if it is in an index). Then all you need to do is to...

但是,我建议您添加一个“IsCurrent”列,如果不是最新价格,则为 0,如果是最新价格,则为 1。这会增加数据不一致的可能风险,但是当表变大时(如果它在索引中)会大大加快整个过程。那么你需要做的就是...

SELECT *
   FROM Thing
   WHERE ThingID IN (1,2,3,4)
     AND IsCurrent = 1

UPDATE

更新

Okay, Markus updated the question to show that ID is a uniqueid, not an int. That makes writing the query even more complex.

好的,Markus 更新了问题以表明 ID 是一个 uniqueid,而不是一个 int。这使得编写查询更加复杂。

SELECT T.* 
   FROM Thing T
   JOIN (SELECT ThingID, max(PriceDateTime)
            WHERE ThingID IN (1,2,3,4)
            GROUP BY ThingID) X ON X.ThingID = T.ThingID 
                                AND X.PriceDateTime = T.PriceDateTime
   WHERE ThingID IN (1,2,3,4)

I'd really suggest using either a "IsCurrent" column or go with the other suggestion found in the answers and use "current price" table and a separate "price history" table (which would ultimately be the fastest, because it keeps the price table itself small).

我真的建议使用“IsCurrent”列或使用答案中的其他建议并使用“当前价格”表和单独的“价格历史”表(最终将是最快的,因为它保持价格桌子本身很小)。

(I know that the ThingID at the bottom is redundant. Just try if it is faster with or without that "WHERE". Not sure which version will be faster after the optimizer did its work.)

(我知道底部的 ThingID 是多余的。试试看有没有“WHERE”它会更快。不确定优化器完成工作后哪个版本会更快。)

回答by Mark Nold

I would try something like the following subquery and forget about changing your data structures.

我会尝试类似下面的子查询,而忘记更改您的数据结构。

SELECT
 *
FROM
 Thing
WHERE 
 (ThingID, PriceDateTime) IN 
 (SELECT 
   ThingID, 
   max(PriceDateTime ) 
  FROM 
   Thing 
  WHERE 
   ThingID IN (1,2,3,4)
  GROUP BY 
   ThingID
 )

Editthe above is ANSI SQL and i'm now guessing having more than one column in a subquery doesnt work for T SQL. Marius, I can't test the following but try;

编辑上面的是 ANSI SQL,我现在猜测在子查询中有多于一列不适用于 T SQL。Marius,我无法测试以下内容,但请尝试;

SELECT
 p.*
FROM
 Thing p,
 (SELECT ThingID, max(PriceDateTime ) FROM Thing WHERE ThingID IN (1,2,3,4) GROUP BY ThingID) m
WHERE 
 p.ThingId = m.ThingId
 and p.PriceDateTime = m.PriceDateTime

another option might be to change the date to a string and concatenate with the id so you have only one column. This would be slightly nasty though.

另一种选择可能是将日期更改为字符串并与 id 连接,这样您就只有一列。虽然这会有点讨厌。

回答by Nick Pierpoint

If the subquery route was too slow I would look at treating your price updates as an audit log and maintaining a ThingPrice table - perhaps as a trigger on the price updates table:

如果子查询路由太慢,我会将您的价格更新视为审核日志并维护一个 ThingPrice 表 - 也许作为价格更新表上的触发器:

ThingID int not null,
UpdateID int not null,
PriceDateTime datetime not null,
Price decimal(18,4) not null

The primary key would just be ThingID and "UpdateID" is the "ID" in your original table.

主键只是 ThingID,“UpdateID”是原始表中的“ID”。

回答by Pop Catalin

Since you are using SQL Server 2005, you can use the new (CROSS|OUTTER) APPLY clause. The APPLY clause let's you join a table with a table valued function.

由于您使用的是 SQL Server 2005,您可以使用新的 (CROSS|OUTTER) APPLY 子句。APPLY 子句让您使用表值函数连接表。

To solve the problem, first define a table valued function to retrieve the top n rows from Thing for a specific id, date ordered:

为了解决这个问题,首先定义一个表值函数来从 Thing 中检索特定 id 的前 n 行,日期排序:

CREATE FUNCTION dbo.fn_GetTopThings(@ThingID AS GUID, @n AS INT)
  RETURNS TABLE
AS
RETURN
  SELECT TOP(@n) *
  FROM Things
  WHERE ThingID= @ThingID
  ORDER BY PriceDateTime DESC
GO

and then use the function to retrieve the top 1 records in a query:

然后使用该函数检索查询中的前 1 条记录:

SELECT *
   FROM Thing t
CROSS APPLY dbo.fn_GetTopThings(t.ThingID, 1)
WHERE t.ThingID IN (1,2,3,4,5,6)

The magic here is done by the APPLY clause which applies the function to every row in the left result setthen joins with the result set returned by the function then retuns the final result set. (Note: to do a left join like apply, use OUTTER APPLY which returns all rows from the left side, while CROSS APPLY returns only the rows that have a match in the right side)

这里的魔法是由 APPLY 子句完成的,该子句将函数应用于左侧结果集中的每一行,然后与函数返回的结果集连接,然后重新调整最终结果集。(注意:要进行像应用这样的左连接,请使用 OUTTER APPLY 返回左侧的所有行,而 CROSS APPLY 仅返回右侧匹配的行)

BlaM: Because I can't post comments yet( due to low rept points) not even to my own answers ^^, I'll answer in the body of the message: -the APPLY clause even, if it uses table valued functions it is optimized internally by SQL Server in such a way that it doesn't call the function for every row in the left result set, but instead takes the inner sql from the function and converts it into a join clause with the rest of the query, so the performance is equivalent or even better (if the plan is chosen right by sql server and further optimizations can be done) than the performance of a query using subqueries), and in my personal experience APPLY has no performance issues when the database is properly indexed and statistics are up to date (just like a normal query with subqueries behaves in such conditions)

BlaM:因为我还不能发表评论(由于低 rept 点)甚至我自己的答案 ^^,我会在消息正文中回答: - APPLY 子句,如果它使用表值函数它由 SQL Server 内部优化,它不会为左侧结果集中的每一行调用该函数,而是从函数中获取内部 sql 并将其转换为与查询的其余部分的连接子句,所以性能与使用子查询的查询性能相当甚至更好(如果 sql server 选择了正确的计划并且可以进行进一步的优化),并且根据我的个人经验,当数据库正确时,APPLY 没有性能问题索引和统计信息是最新的(就像带有子查询的普通查询在这种情况下的行为一样)

回答by BirgerH

I'm converting the uniqueidentifier to a binary so that I can get a MAX of it. This should make sure that you won't get duplicates from multiple records with identical ThingIDs and PriceDateTimes:

我正在将 uniqueidentifier 转换为二进制文件,以便我可以获得它的 MAX。这应该确保您不会从具有相同 ThingID 和 PriceDateTimes 的多个记录中获得重复项:

SELECT * FROM Thing WHERE CONVERT(BINARY(16),Thing.ID) IN
(
 SELECT MAX(CONVERT(BINARY(16),Thing.ID))
  FROM Thing
  INNER JOIN
   (SELECT ThingID, MAX(PriceDateTime) LatestPriceDateTime FROM Thing
    WHERE PriceDateTime >= CAST(FLOOR(CAST(GETDATE() AS FLOAT)) AS DATETIME)
    GROUP BY ThingID) LatestPrices
  ON Thing.ThingID = LatestPrices.ThingID
   AND Thing.PriceDateTime = LatestPrices.LatestPriceDateTime
 GROUP BY Thing.ThingID, Thing.PriceDateTime
) AND Thing.ThingID IN (1,2,3,4,5,6)

回答by Peter

Since ID is not sequential, I assume you have a unique index on ThingID and PriceDateTime so only one price can be the most recent for a given item.

由于 ID 不是连续的,我假设您在 ThingID 和 PriceDateTime 上有一个唯一索引,因此对于给定的项目,只有一个价格可以是最新的。

This query will get all of the items in the list IF they were priced today. If you remove the where clause for PriceDate you will get the latest price regardless of date.

如果今天定价,此查询将获取列表中的所有项目。如果删除 PriceDate 的 where 子句,无论日期如何,您都将获得最新价格。

SELECT * 
FROM Thing thi
WHERE thi.ThingID IN (1,2,3,4,5,6)
  AND thi.PriceDateTime =
     (SELECT MAX(maxThi.PriceDateTime)
      FROM Thing maxThi
      WHERE maxThi.PriceDateTime >= CAST( CONVERT(varchar(20), GETDATE(), 106) AS DateTime)
        AND maxThi.ThingID = thi.ThingID)

Note that I changed ">" to ">=" since you could have a price right at the start of a day

请注意,我将“>”更改为“>=”,因为您可以在一天开始时确定价格

回答by Galwegian

It depends on the nature of how your data will be used, but if the old price data will not be used nearly as regularly as the current price data, there may be an argument here for a price history table. This way, non-current data may be archived off to the price history table (probably by triggers) as the new prices come in.

这取决于您的数据将如何使用的性质,但如果旧的价格数据不会像当前价格数据那样经常使用,那么这里可能会有一个价格历史表的争论。这样,当新价格出现时,非当前数据可能会被归档到价格历史表(可能由触发器)。

As I say, depending on your access model, this could be an option.

正如我所说,根据您的访问模式,这可能是一种选择。

回答by MobyDX

Try this (provided you only need the latest price, not the identifier or datetime of that price)

试试这个(前提是您只需要最新的价格,而不是该价格的标识符或日期时间)

SELECT ThingID, (SELECT TOP 1 Price FROM Thing WHERE ThingID = T.ThingID ORDER BY PriceDateTime DESC) Price
FROM Thing T
WHERE ThingID IN (1,2,3,4) AND DATEDIFF(D, PriceDateTime, GETDATE()) = 0
GROUP BY ThingID

回答by 30thh

It must work without using a global PK column (for complex primary keys for example):

它必须在不使用全局 PK 列的情况下工作(例如对于复杂的主键):

SELECT t1.*, t2.PriceDateTime AS bigger FROM Prices t1 
LEFT JOIN Prices t2 ON t1.ThingID = t2.ThingID AND t1.PriceDateTime < t2.PriceDateTime 
HAVING t2.PriceDateTime IS NULL

回答by Novalis

maybe i missunderstood the taks but what about a:

也许我误解了 taks,但是一个:

SELECT ID, ThingID, max(PriceDateTime), Price FROM Thing GROUP BY ThingID

SELECT ID, ThingID, max(PriceDateTime), Price FROM Thing GROUP BY ThingID