Note: this page is a translation of a popular StackOverFlow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license and attribute it to the original authors (not me). Original StackOverFlow question: http://stackoverflow.com/questions/1345065/

Time: 2020-09-01 03:22:41  Source: igfitidea

SQL QUERY replace NULL value in a row with a value from the previous known value

Tags: sql, null

Asked by

I have 2 columns

date   number       
----   ------
1      3           
2      NULL        
3      5           
4      NULL        
5      NULL        
6      2          
.......

I need to replace each NULL with the last known value from an earlier date in the date column, e.g. date 2 gets number = 3, and dates 4 and 5 both get number = 5. The NULL values appear randomly.

Answer by Adriaan Stander

If you are using SQL Server, this should work:

DECLARE @Table TABLE(
        ID INT,
        Val INT
)

INSERT INTO @Table (ID,Val) SELECT 1, 3
INSERT INTO @Table (ID,Val) SELECT 2, NULL
INSERT INTO @Table (ID,Val) SELECT 3, 5
INSERT INTO @Table (ID,Val) SELECT 4, NULL
INSERT INTO @Table (ID,Val) SELECT 5, NULL
INSERT INTO @Table (ID,Val) SELECT 6, 2


SELECT  *,
        ISNULL(Val, (SELECT TOP 1 Val FROM @Table WHERE ID < t.ID AND Val IS NOT NULL ORDER BY ID DESC))
FROM    @Table t

Answer by Bill Karwin

Here's a MySQL solution:

UPDATE mytable
SET number = (@n := COALESCE(number, @n))
ORDER BY date;

This is concise, but won't necessarily work in other brands of RDBMS. For other brands, there might be a brand-specific solution that is more relevant. That's why it's important to tell us the brand you're using.

It's nice to be vendor-independent, as @Pax commented, but failing that, it's also nice to use your chosen brand of database to its fullest advantage.

Explanation of the above query:

@n is a MySQL user variable. It starts out NULL, and is assigned a value on each row as the UPDATE runs through the rows. Where number is non-NULL, @n is assigned the value of number. Where number is NULL, the COALESCE() defaults to the previous value of @n. In either case, this becomes the new value of the number column and the UPDATE proceeds to the next row. The @n variable retains its value from row to row, so subsequent rows get values that come from the prior row(s). The order of the UPDATE is predictable, because of MySQL's special use of ORDER BY with UPDATE (this is not standard SQL).
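The row-by-row behaviour described here can be imitated outside the database. The following is only a rough Python sketch of the same carry-forward logic, not MySQL itself; the function name fill_forward and the sample rows are made up for illustration.

```python
from typing import List, Optional, Tuple

Row = Tuple[int, Optional[int]]  # (date, number); None stands in for NULL


def fill_forward(rows: List[Row]) -> List[Row]:
    """Imitate SET number = (@n := COALESCE(number, @n)) ORDER BY date."""
    n: Optional[int] = None  # plays the role of @n, which starts out NULL
    filled: List[Row] = []
    for date, number in sorted(rows, key=lambda r: r[0]):  # ORDER BY date
        # COALESCE(number, @n): keep the row's own value if it has one,
        # otherwise fall back to the value remembered from earlier rows.
        n = number if number is not None else n
        filled.append((date, n))  # this becomes the row's new number
    return filled


rows = [(1, 3), (2, None), (3, 5), (4, None), (5, None), (6, 2)]
print(fill_forward(rows))
# [(1, 3), (2, 3), (3, 5), (4, 5), (5, 5), (6, 2)]
```

Note that a NULL on the very first row stays NULL, because there is no earlier value to carry forward.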

Answer by voutmaster

The best solution is the one offered by Bill Karwin. I recently had to solve this in a relatively large result set (1000 rows with 12 columns, each needing this type of "show me the last non-null value if this value is null on the current row"), and the update method with a TOP 1 select for the previous known value (or a subquery with a TOP 1) ran super slow.

I am using SQL Server 2005, and the syntax for a variable replacement is slightly different from MySQL:

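DECLARE @n INT  -- holds the last non-NULL value seen; starts out NULL

UPDATE mytable
SET
    @n = COALESCE(number, @n),
    number = COALESCE(number, @n)
-- ORDER BY date is not valid in a SQL Server UPDATE, so it is omitted here;
-- rows are visited in the order the engine chooses (often the clustered index), which is not guaranteed.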

The first set statement updates the value of the variable @n to the current row's value of 'number' if 'number' is not null (COALESCE returns the first non-null argument you pass into it). The second set statement updates the actual column value for 'number' to itself (if not null) or to the variable @n (which always contains the last non-NULL value encountered).

The beauty of this approach is that there are no additional resources expended on scanning the temporary table over and over again... The in-row update of @n takes care of tracking the last non-null value.

I don't have enough rep to vote his answer up, but someone should. It's the most elegant and best-performing.

Answer by APC

Here is the Oracle solution (10g or higher). It uses the analytic function last_value() with the ignore nulls option, which substitutes the last non-null value for the column.

SQL> select *
  2  from mytable
  3  order by id
  4  /

        ID    SOMECOL
---------- ----------
         1          3
         2
         3          5
         4
         5
         6          2

6 rows selected.

SQL> select id
  2         , last_value(somecol ignore nulls) over (order by id) somecol
  3  from mytable
  4  /

        ID    SOMECOL
---------- ----------
         1          3
         2          3
         3          5
         4          5
         5          5
         6          2

6 rows selected.

SQL>
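For engines without the ignore nulls option, a similar result can be had with a different technique: number the rows by a running COUNT of non-null values, then take the one non-null value in each group. The sketch below is not the Oracle query above; it runs that substitute against an in-memory SQLite database from Python, and assumes a SQLite build with window functions (3.25 or later). The table and column names follow the example, and grp is a made-up helper column.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (id INTEGER, somecol INTEGER)")
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?)",
    [(1, 3), (2, None), (3, 5), (4, None), (5, None), (6, 2)],
)

# COUNT(somecol) skips NULLs, so the running count only increases on rows
# that carry a value. Every NULL row therefore shares a group number with
# the last non-null row before it.
query = """
SELECT id,
       MAX(somecol) OVER (PARTITION BY grp) AS somecol
FROM (SELECT id,
             somecol,
             COUNT(somecol) OVER (ORDER BY id) AS grp
      FROM mytable) AS g
ORDER BY id
"""
filled = conn.execute(query).fetchall()
print(filled)
# [(1, 3), (2, 3), (3, 5), (4, 5), (5, 5), (6, 2)]
```

Each group holds exactly one non-null value, so MAX simply picks it out; rows before the first known value fall in group 0 and stay NULL.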

Answer by Gerardo Lima

The following script solves this problem and only uses plain ANSI SQL. I tested this solution on SQL2008, SQLite3 and Oracle11g.

CREATE TABLE test(mysequence INT, mynumber INT);

INSERT INTO test VALUES(1, 3);
INSERT INTO test VALUES(2, NULL);
INSERT INTO test VALUES(3, 5);
INSERT INTO test VALUES(4, NULL);
INSERT INTO test VALUES(5, NULL);
INSERT INTO test VALUES(6, 2);

SELECT t1.mysequence, t1.mynumber AS ORIGINAL
, (
    SELECT t2.mynumber
    FROM test t2
    WHERE t2.mysequence = (
        SELECT MAX(t3.mysequence)
        FROM test t3
        WHERE t3.mysequence <= t1.mysequence
        AND t3.mynumber IS NOT NULL
       )
) AS CALCULATED
FROM test t1;
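Since the answer mentions SQLite3, the script can be tried from Python's built-in sqlite3 module. This is just a small harness around the query above using an in-memory database; the ORDER BY at the end is an addition here so the printed rows come back in a fixed order.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE test(mysequence INT, mynumber INT);
INSERT INTO test VALUES(1, 3);
INSERT INTO test VALUES(2, NULL);
INSERT INTO test VALUES(3, 5);
INSERT INTO test VALUES(4, NULL);
INSERT INTO test VALUES(5, NULL);
INSERT INTO test VALUES(6, 2);
""")

# For each row, find the highest sequence at or before it that has a value,
# then read the number stored on that row.
query = """
SELECT t1.mysequence, t1.mynumber AS original,
       (SELECT t2.mynumber
        FROM test t2
        WHERE t2.mysequence = (SELECT MAX(t3.mysequence)
                               FROM test t3
                               WHERE t3.mysequence <= t1.mysequence
                                 AND t3.mynumber IS NOT NULL)) AS calculated
FROM test t1
ORDER BY t1.mysequence
"""
result = conn.execute(query).fetchall()
for row in result:
    print(row)  # (mysequence, original, calculated); None stands for NULL
```

The last column of each printed row should read 3, 3, 5, 5, 5, 2 for the sample data.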

Answer by Cyrus Christ

I know it is a very old forum, but I came across this while troubleshooting my own problem :) and realised that the other answers give somewhat complex solutions to the above problem. Please see my solution below:

DECLARE @A TABLE(ID INT, Val INT)

INSERT INTO @A(ID,Val) SELECT 1, 3
INSERT INTO @A(ID,Val) SELECT 2, NULL
INSERT INTO @A(ID,Val) SELECT 3, 5
INSERT INTO @A(ID,Val) SELECT 4, NULL
INSERT INTO @A(ID,Val) SELECT 5, NULL
INSERT INTO @A(ID,Val) SELECT 6, 2

UPDATE D
    SET D.VAL = E.VAL
    FROM (SELECT A.ID C_ID, MAX(B.ID) P_ID
          FROM  @A AS A
           JOIN @A AS B ON A.ID > B.ID
          WHERE A.Val IS NULL
            AND B.Val IS NOT NULL
          GROUP BY A.ID) AS C
    JOIN @A AS D ON C.C_ID = D.ID
    JOIN @A AS E ON C.P_ID = E.ID

SELECT * FROM @A

Hope this may help someone:)

Answer by PieCharmed

If you're looking for a solution for Redshift, this will work with the frame clause:

SELECT date, 
       last_value(columnName ignore nulls) 
                   over (order by date
                         rows between unbounded preceding and current row) as columnName 
 from tbl

Answer by George

This is the solution for MS Access.

The example table is called tab, with fields id and val.

SELECT (SELECT last(val)
          FROM tab AS temp
          WHERE tab.id >= temp.id AND temp.val IS NOT NULL) AS val2, *
  FROM tab;

Answer by van

First of all, do you really need to store the values? You may just use a view that does the job:

SELECT  t."date",
        x."number" AS "number"
FROM    @Table t
JOIN    @Table x
    ON  x."date" = (SELECT  TOP 1 z."date"
                    FROM    @Table z
                    WHERE   z."date" <= t."date"
                        AND z."number" IS NOT NULL
                    ORDER BY z."date" DESC)

If you really do have the ID ("date") column and it is a primary key (clustered), then this query should be pretty fast. But check the query plan: it might be better to have a covering index that includes the Val column as well.

Also, if you prefer to avoid procedures where you can, you can use a similar query for UPDATE:

UPDATE  t
SET     t."number" = x."number"
FROM    @Table t
JOIN    @Table x
    ON  x."date" = (SELECT  TOP 1 z."date"
                    FROM    @Table z
                    WHERE   z."date" < t."date" --//@note: < and not <= here, as = not required
                        AND z."number" IS NOT NULL
                    ORDER BY z."date" DESC)
WHERE   t."number" IS NULL

NOTE: the code should work on SQL Server.

Answer by OMG Ponies

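-- "mytable" stands in for the real table name (the original used TABLE as a placeholder).
UPDATE mytable
   SET number = (SELECT t.number
                   FROM mytable t
                  WHERE t.date = (SELECT MAX(t2.date)   -- latest earlier row that has a value
                                    FROM mytable t2
                                   WHERE t2.number IS NOT NULL
                                     AND t2.date < mytable.date))
 WHERE number IS NULL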