Better techniques for trimming leading zeros in SQL Server?

Source: Stack Overflow, http://stackoverflow.com/questions/662383/ (licensed under CC BY-SA 4.0; reuse must follow the same license and attribute the original authors).

Tags: sql, sql-server, sql-server-2005, tsql, string

Asked by Cade Roux

I've been using this for some time:

SUBSTRING(str_col, PATINDEX('%[^0]%', str_col), LEN(str_col))

However recently, I've found a problem with columns with all "0" characters like '00000000' because it never finds a non-"0" character to match.
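
The failure can be reproduced with a small sketch. The literal values and column aliases below are illustrative and not from the original post. When there is no non-zero character, PATINDEX returns 0, and SUBSTRING with a start position of 0 returns one character fewer than requested, so only a single zero is dropped:

```sql
-- Illustrative literals standing in for str_col.
SELECT t.s,
       PATINDEX('%[^0]%', t.s)                           AS first_non_zero,
       SUBSTRING(t.s, PATINDEX('%[^0]%', t.s), LEN(t.s)) AS trimmed
FROM (SELECT '00000000' AS s UNION ALL SELECT '000123') AS t;
-- '00000000' -> first_non_zero 0, trimmed '0000000' (seven zeros remain)
-- '000123'   -> first_non_zero 4, trimmed '123'
```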

An alternative technique I've seen is to use LTRIM together with REPLACE:

REPLACE(LTRIM(REPLACE(str_col, '0', ' ')), ' ', '0')

This has a problem if there are embedded spaces, because they will be turned into "0"s when the spaces are turned back into "0"s.
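
A minimal sketch of that side effect, using made-up sample values: the embedded space in the first value comes back as a zero, while a value without spaces is trimmed as intended.

```sql
-- Illustrative literals standing in for str_col.
SELECT t.s,
       REPLACE(LTRIM(REPLACE(t.s, '0', ' ')), ' ', '0') AS trimmed
FROM (SELECT '00A B' AS s UNION ALL SELECT '000123') AS t;
-- '00A B'  -> 'A0B'  (the embedded space became a zero)
-- '000123' -> '123'
```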

I'm trying to avoid a scalar UDF. I've found a lot of performance problems with UDFs in SQL Server 2005.

Answer by Arvo

SUBSTRING(str_col, PATINDEX('%[^0]%', str_col+'.'), LEN(str_col))
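
The answer gives no explanation, so here is one reading of why it works, as a sketch with illustrative literals. Appending '.' guarantees that PATINDEX always finds a non-zero character. For an all-zero string the match position is one past the end of the original value, so SUBSTRING returns an empty string rather than a truncated run of zeros:

```sql
-- Illustrative literals standing in for str_col.
SELECT t.s,
       PATINDEX('%[^0]%', t.s + '.')                           AS first_non_zero,
       SUBSTRING(t.s, PATINDEX('%[^0]%', t.s + '.'), LEN(t.s)) AS trimmed
FROM (SELECT '00000000' AS s UNION ALL SELECT '000123' UNION ALL SELECT '') AS t;
-- '00000000' -> first_non_zero 9, trimmed ''
-- '000123'   -> first_non_zero 4, trimmed '123'
-- ''         -> first_non_zero 1, trimmed ''
```

Note that an all-zero value becomes an empty string here, not '0'; several later answers add a CASE expression to handle that.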

Answer by Quassnoi

Why don't you just cast the value to INTEGER and then back to VARCHAR?

SELECT  CAST(CAST('000000000' AS INTEGER) AS VARCHAR)

--------
       0

Answer by MikeTeeVee

The other answers here do not take into account values that are all zeros (or even a single zero).
Some always turn an empty string into zero, which is wrong when it is supposed to remain blank.
Re-read the original question. This answers what the asker wants.

Solution #1:

--This example uses both leading and trailing zeros.
--It avoids losing the trailing zeros and avoids converting embedded spaces into more zeros.
--A non-whitespace character ("_") is appended to retain trailing zeros after calling REPLACE().
--Simply remove the RTRIM() function call if you want to preserve trailing spaces.
--If your application treats zeros and empty strings as the same thing,
--  then you may skip the CASE expression entirely and just use CN.CleanNumber.
DECLARE @WackadooNumber VarChar(50) = ' 0 0123ABC D0 '--'000'--
SELECT WN.WackadooNumber, CN.CleanNumber,
       (CASE WHEN WN.WackadooNumber LIKE '%0%' AND CN.CleanNumber = '' THEN '0' ELSE CN.CleanNumber END)[AllowZero]
 FROM (SELECT @WackadooNumber[WackadooNumber]) AS WN
 OUTER APPLY (SELECT RTRIM(RIGHT(WN.WackadooNumber, LEN(LTRIM(REPLACE(WN.WackadooNumber + '_', '0', ' '))) - 1))[CleanNumber]) AS CN
--Result: "123ABC D0"

Solution #2 (with sample data):

SELECT O.Type, O.Value, Parsed.Value[WrongValue],
       (CASE WHEN CHARINDEX('0', T.Value)  > 0--If there's at least one zero.
              AND LEN(Parsed.Value) = 0--And the trimmed length is zero.
             THEN '0' ELSE Parsed.Value END)[FinalValue],
       (CASE WHEN CHARINDEX('0', T.Value)  > 0--If there's at least one zero.
              AND LEN(Parsed.TrimmedValue) = 0--And the trimmed length is zero.
             THEN '0' ELSE LTRIM(RTRIM(Parsed.TrimmedValue)) END)[FinalTrimmedValue]
  FROM 
  (
    VALUES ('Null', NULL), ('EmptyString', ''),
           ('Zero', '0'), ('Zero', '0000'), ('Zero', '000.000'),
           ('Spaces', '    0   A B C '), ('Number', '000123'),
           ('AlphaNum', '000ABC123'), ('NoZero', 'NoZerosHere')
  ) AS O(Type, Value)--O is for Original.
  CROSS APPLY
  ( --This Step is Optional.  Use if you also want to remove leading spaces.
    SELECT LTRIM(RTRIM(O.Value))[Value]
  ) AS T--T is for Trimmed.
  CROSS APPLY
  ( --From @CadeRoux's Post.
    SELECT SUBSTRING(O.Value, PATINDEX('%[^0]%', O.Value + '.'), LEN(O.Value))[Value],
           SUBSTRING(T.Value, PATINDEX('%[^0]%', T.Value + '.'), LEN(T.Value))[TrimmedValue]
  ) AS Parsed

Results:

[Image: MikeTeeVee_SQL_Server_Remove_Leading_Zeros]

Summary:

You could use what I have above for a one-off removal of leading zeros.
If you plan on reusing it a lot, then place it in an inline table-valued function (ITVF).
Your concerns about performance problems with UDFs are understandable.
However, this problem only applies to scalar functions and multi-statement table-valued functions.
Using ITVFs is perfectly fine.
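
As a rough sketch of what such an ITVF might look like: the function name, parameter size, and sample values below are hypothetical, and the body simply reuses the PATINDEX expression with the appended '.' from earlier in this thread (so an all-zero value yields an empty string). The VALUES row constructor in the usage query needs SQL Server 2008 or later.

```sql
-- Hypothetical function name; not from the original answer.
CREATE FUNCTION dbo.itvf_StripLeadingZeros (@str VARCHAR(128))
RETURNS TABLE
AS
RETURN
(
    -- Appending '.' guarantees PATINDEX finds a non-zero character.
    SELECT SUBSTRING(@str, PATINDEX('%[^0]%', @str + '.'), LEN(@str)) AS Stripped
);
GO

-- Usage: apply the function once per row with CROSS APPLY.
SELECT v.str_col, s.Stripped
FROM (VALUES ('000123'), ('0000'), ('000ABC123')) AS v(str_col)
CROSS APPLY dbo.itvf_StripLeadingZeros(v.str_col) AS s;
-- '000123'    -> '123'
-- '0000'      -> ''
-- '000ABC123' -> 'ABC123'
```

Because the function body is a single SELECT, the optimizer can expand it into the calling query instead of invoking it row by row.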

I have the same problem with our third-party database.
Many alphanumeric fields are entered without the leading zeros, dang humans!
This makes joins impossible without cleaning up the missing leading zeros.

Conclusion:

Instead of removing the leading-zeros, you may want to consider just padding your trimmed-values with leading-zeros when you do your joins.
Better yet, clean up your data in the table by adding leading zeros, then rebuilding your indexes.
I think this would be WAY faster and less complex.

SELECT RIGHT('0000000000' + LTRIM(RTRIM(NULLIF(' 0A10  ', ''))), 10)--0000000A10
SELECT RIGHT('0000000000' + LTRIM(RTRIM(NULLIF('', ''))), 10)--NULL --When Blank.

Answer by Joel Coehoorn

Instead of a space replace the 0's with a 'rare' whitespace character that shouldn't normally be in the column's text. A line feed is probably good enough for a column like this. Then you can LTrim normally and replace the special character with 0's again.
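
One way to put this idea into practice is sketched below. It is a variant rather than the answer's exact wording: LTRIM without a character argument strips only spaces, so the sketch first hides the existing spaces behind a line feed, CHAR(10), then applies the usual zero-to-space trick, and finally restores the spaces. It assumes the column never contains a line feed; the sample values are illustrative.

```sql
-- Illustrative literals standing in for str_col.
SELECT t.s,
       REPLACE(
           REPLACE(
               LTRIM(REPLACE(REPLACE(t.s, ' ', CHAR(10)), '0', ' ')),
               ' ', '0'),       -- turn the remaining spaces back into zeros
           CHAR(10), ' ')       -- restore the original spaces
           AS trimmed
FROM (SELECT '00A B0' AS s UNION ALL SELECT '000123') AS t;
-- '00A B0' -> 'A B0'  (embedded space and trailing zero preserved)
-- '000123' -> '123'
```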

Answer by Scott

The following will return '0' if the string consists entirely of zeros:

CASE WHEN SUBSTRING(str_col, PATINDEX('%[^0]%', str_col+'.'), LEN(str_col)) = '' THEN '0' ELSE SUBSTRING(str_col, PATINDEX('%[^0]%', str_col+'.'), LEN(str_col)) END AS str_col

Answer by user2600313

This makes a nice function:

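-- Drop the function only if it already exists, so the script also runs on a fresh database.
IF OBJECT_ID(N'dbo.FN_StripLeading', N'FN') IS NOT NULL
    DROP FUNCTION [dbo].[FN_StripLeading]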
GO
CREATE FUNCTION [dbo].[FN_StripLeading] (@string VarChar(128), @stripChar VarChar(1))
RETURNS VarChar(128)
AS
BEGIN
-- http://stackoverflow.com/questions/662383/better-techniques-for-trimming-leading-zeros-in-sql-server
    DECLARE @retVal VarChar(128),
            @pattern varChar(10)
    SELECT @pattern = '%[^'+@stripChar+']%'
    SELECT @retVal = CASE WHEN SUBSTRING(@string, PATINDEX(@pattern, @string+'.'), LEN(@string)) = '' THEN @stripChar ELSE SUBSTRING(@string, PATINDEX(@pattern, @string+'.'), LEN(@string)) END
    RETURN (@retVal)
END
GO
GRANT EXECUTE ON [dbo].[FN_StripLeading] TO PUBLIC

Answer by Brisbe

My version of this is an adaptation of Arvo's work, with a little more added to handle two other cases:

1) If we have all 0s, we should return the digit 0.

2) If we have a blank, we should still return a blank character.

CASE 
    WHEN PATINDEX('%[^0]%', str_col + '.') > LEN(str_col) THEN RIGHT(str_col, 1) 
    ELSE SUBSTRING(str_col, PATINDEX('%[^0]%', str_col + '.'), LEN(str_col))
 END
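
A small check of those two cases, with illustrative sample values; the square brackets are added only to make blanks visible in the output:

```sql
-- Illustrative literals standing in for str_col.
SELECT '[' + t.str_col + ']' AS input,
       '[' + CASE
                 WHEN PATINDEX('%[^0]%', t.str_col + '.') > LEN(t.str_col) THEN RIGHT(t.str_col, 1)
                 ELSE SUBSTRING(t.str_col, PATINDEX('%[^0]%', t.str_col + '.'), LEN(t.str_col))
             END + ']' AS output
FROM (SELECT '0000' AS str_col UNION ALL SELECT '000123'
      UNION ALL SELECT '' UNION ALL SELECT ' ') AS t;
-- '0000'   -> '[0]'
-- '000123' -> '[123]'
-- ''       -> '[]'
-- ' '      -> '[ ]'
```

The single-space case works because LEN ignores trailing spaces and reports 0, so the first branch is taken and RIGHT returns the space itself.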

Answer by tichra

CAST(value AS INT) will always work if the string is a number.
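
A hedged sketch of what happens when that condition does not hold. It uses TRY_CAST, which is available from SQL Server 2012 onward and is not part of the original answer; the sample values are illustrative. A plain CAST over the same rows would raise a conversion error on the alphanumeric value.

```sql
-- Illustrative literals standing in for the column.
SELECT t.s,
       CAST(TRY_CAST(t.s AS INT) AS VARCHAR(11)) AS via_int
FROM (SELECT '000123' AS s UNION ALL SELECT '000ABC123' UNION ALL SELECT '') AS t;
-- '000123'    -> '123'
-- '000ABC123' -> NULL (not a number)
-- ''          -> '0'  (an empty string casts to 0)
```

Values with more digits than INT can hold would need BIGINT or DECIMAL instead.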

Answer by JJFord3

If you are using Snowflake SQL, you might use this:

ltrim(str_col,'0')

The ltrim function removes all instances of the designated set of characters from the left side.

So ltrim(str_col,'0') on '00000008A' would return '8A'

And rtrim(str_col,'0.') on '$125.00' would return '$125'

Answer by Lisandro

  SUBSTRING(str_col, IIF(LEN(str_col) > 0, PATINDEX('%[^0]%', LEFT(str_col, LEN(str_col) - 1) + '.'), 0), LEN(str_col))

Works fine even with '0', '00' and so on.
