SQL 如何有效地选择小于和大于给定值的最接近值?
声明:本页面是StackOverFlow热门问题的中英对照翻译,遵循CC BY-SA 4.0协议,如果您需要使用它,必须同样遵循CC BY-SA许可,注明原文地址和作者信息,同时你必须将它归于原作者(不是我):StackOverFlow
原文地址: http://stackoverflow.com/questions/8938234/
Warning: these are provided under cc-by-sa 4.0 license. You are free to use/share it, But you must attribute it to the original authors (not me):
StackOverFlow
How can I select the nearest value less-than and greater-than a given value efficiently?
提问by brianestey
I have two tables, one for values one for location and am trying to interpolate location. The tables have been simplified to the following:
我有两个表,一个用于值,一个用于位置,我正在尝试插入位置。这些表已简化为以下内容:
CREATE TABLE value(
Timestamp DATETIME2,
Value float NOT NULL,
PRIMARY KEY(Timestamp)
);
CREATE TABLE location(
Timestamp DATETIME2,
Position INT NOT NULL,
PRIMARY KEY(Timestamp)
);
INSERT INTO value VALUES
('2011/12/1 16:55:01', 1),
('2011/12/1 16:55:02', 5),
('2011/12/1 16:55:05', 10),
('2011/12/1 16:55:08', 6);
INSERT INTO location VALUES
('2011/12/1 16:55:00', 0),
('2011/12/1 16:55:05', 10),
('2011/12/1 16:55:10', 5)
The expected results would be
预期的结果是
TimeStamp, Value, LowerTime, LowerLocation, UpperTime, UpperLocation
2011-12-01 16:55:01, 1, 2011-12-01 16:55:00, 0, 2011-12-01 16:55:05, 10
2011-12-01 16:55:02, 5, 2011-12-01 16:55:00, 0, 2011-12-01 16:55:05, 10
2011-12-01 16:55:05, 10, 2011-12-01 16:55:05, 10, 2011-12-01 16:55:05, 10
2011-12-01 16:55:08, 6, 2011-12-01 16:55:05, 10, 2011-12-01 16:55:10, 5
(Keep in mind this is simplified sample data to get the idea of the query I am trying to perform across.)
(请记住,这是简化的示例数据,以了解我正在尝试执行的查询。)
To do the interpolation, I need to figure out the time and locations before and after a given values time. I am currently doing this with a query that looks like:
要进行插值,我需要找出给定值时间前后的时间和位置。我目前正在使用如下查询执行此操作:
SELECT
V.Timestamp,
V.Value,
(SELECT MAX(Timestamp) FROM dbo.location WHERE Timestamp <= V.Timestamp) as LowerTime,
(SELECT TOP 1 Position FROM dbo.location WHERE Timestamp <= V.Timestamp ORDER BY timestamp DESC) as LowerLocation,
(SELECT MIN(Timestamp) FROM dbo.location WHERE Timestamp >= V.Timestamp) as UpperTime,
(SELECT TOP 1 Position FROM dbo.location WHERE Timestamp >= V.Timestamp ORDER BY timestamp ASC) as UpperLocation
FROM
dbo.value V
Now this works, but this obviously is doing a lot of work. I'm thinking there must be a query simplification that I'm missing but I've been playing with it all morning and haven't come up with anything concrete. Hoping someone here has a better idea.
现在这可行,但这显然做了很多工作。我想一定有一个我遗漏的查询简化,但我整个上午都在玩它并且没有想出任何具体的东西。希望这里有人有更好的主意。
I am currently exploring if there is a way to figure out the LowerTime and UpperTime and use those in determining the Locations. Something like:
我目前正在探索是否有办法找出 LowerTime 和 UpperTime 并使用它们来确定位置。就像是:
SELECT
V.Timestamp,
V.Value,
(SELECT MAX(Timestamp) FROM dbo.location WHERE Timestamp <= V.Timestamp) as LowerTime,
(SELECT Position FROM dbo.location WHERE Timestamp = LowerTime) as LowerLocation,
(SELECT MIN(Timestamp) FROM dbo.location WHERE Timestamp >= V.Timestamp) as UpperTime,
(SELECT Position FROM dbo.location WHERE Timestamp = UpperTime) as UpperLocation
FROM
dbo.value V
but this doesn't work.
但这不起作用。
EDIT1: Updated query as suggested. However no visible change in execution time.
EDIT1:按照建议更新查询。但是,执行时间没有明显变化。
EDIT2: Added my thoughts of the approach I am currently trying.
EDIT2:添加了我对我目前正在尝试的方法的想法。
回答by Sergey Kudriavtsev
For simplicity you may at least use MAX()
and MIN()
functions for querying timestamp
field instead of TOP 1
and ORDER BY
.
为简单起见,您至少可以使用MAX()
andMIN()
函数来查询timestamp
字段而不是TOP 1
and ORDER BY
。
Full query will be
完整的查询将是
SELECT
V.Timestamp,
V.Value,
(SELECT MAX(Timestamp) FROM dbo.location WHERE Timestamp <= V.Timestamp) as LowerTime,
(SELECT TOP 1 Position FROM dbo.location WHERE Timestamp <= V.Timestamp ORDER BY timestamp DESC) as LowerLocation,
(SELECT MIN(Timestamp) FROM dbo.location WHERE Timestamp >= V.Timestamp) as UpperTime,
(SELECT TOP 1 Position FROM dbo.location WHERE Timestamp >= V.Timestamp ORDER BY timestamp ASC) as UpperLocation
FROM
dbo.value V
回答by Damien_The_Unbeliever
This might do the trick (although I think the join looks quite ugly):
这可能会奏效(虽然我认为连接看起来很丑陋):
;with OrderedLocations as (
select
v.Timestamp,
v.Value,
l.Timestamp as tsl,
l.Position,
ROW_NUMBER() OVER (PARTITION BY v.Timestamp ORDER BY CASE WHEN l.Timestamp <= v.Timestamp THEN l.Timestamp ELSE '00010101' END desc) as PrevRN,
ROW_NUMBER() OVER (PARTITION BY v.Timestamp ORDER BY CASE WHEN l.Timestamp >= v.Timestamp THEN l.Timestamp ELSE '99991231' END asc) as NextRN
from
value v
cross join
location l
)
select
ol1.Timestamp,
ol1.Value,
ol1.tsl,
ol1.Position,
ol2.tsl,
ol2.Position
from
OrderedLocations ol1
inner join
OrderedLocations ol2
on
ol1.Timestamp = ol2.Timestamp and
ol1.Value = ol2.Value
where
ol1.PrevRN = 1 and
ol2.NextRN = 1
Unfortunately, as with most efficiency/performance questions, the answer tends to be try lots of different combinations with your actualtables and data, and measure how each one performs.
不幸的是,与大多数效率/性能问题一样,答案往往是尝试使用实际表和数据进行多种不同组合,并衡量每个组合的表现。
An alternative (avoiding the join) using the same CTE as above would be:
使用与上述相同的 CTE 的替代方案(避免连接)是:
SELECT Timestamp,Value,
MAX(CASE WHEN PrevRN=1 THEN tsl END),MAX(CASE WHEN PrevRN=1 then Position END),
MAX(CASE WHEN NextRN=1 THEN tsl END),MAX(CASE WHEN NextRN=1 then Position END)
FROM
OrderedLocations
where PrevRN=1 or NextRN=1
group by Timestamp,Value
The CTE (OrderedLocations
) is trying to construct a rowset where every row from location is matched to every row in value
. For each resulting row, we calculate two ROW_NUMBER
s - the row number where we number all rows with a lower or equal timestamp (PrevRN
) in descending order, and another where we number all rows with a greater or equal timestamp (NextRN
) ascending. We then construct our final result by just considering those rows where one of those row numbers is 1.
CTE( OrderedLocations
) 试图构造一个行集,其中 location 中的每一行都与value
. 对于每个结果行,我们计算两个ROW_NUMBER
s - 行号,其中我们PrevRN
以降序对具有较低或相等时间戳 ( ) 的所有行进行编号,另一个是我们对具有大于或等于时间戳 ( NextRN
) 的所有行进行升序编号。然后我们只考虑那些行号之一为 1 的行来构建我们的最终结果。