SQL Server: How do you remove punctuation from a field?

Note: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same license and attribute it to the original authors (not me), citing the original source: http://stackoverflow.com/questions/1818430/

Date: 2020-09-01 04:35:45  Source: igfitidea

Tags: sql, sql-server, tsql, replace, punctuation

Asked by Ev.

Does anyone know a good way to remove punctuation from a field in SQL Server?

I'm thinking

UPDATE tblMyTable SET FieldName = REPLACE(REPLACE(REPLACE(FieldName,',',''),'.',''),'''' ,'')

but it seems a bit tedious when I intend to remove a large number of different characters, for example: !@#$%^&*()<>:"

Thanks in advance

Answered by Tim C

Ideally, you would do this in an application language such as C# + LINQ, as mentioned in another answer.

If you wanted to do it purely in T-SQL, though, one way to make things neater would be to first create a table that holds all the punctuation you want to remove.

CREATE TABLE Punctuation 
(
    Symbol VARCHAR(1) NOT NULL
)

INSERT INTO Punctuation (Symbol) VALUES('''')
INSERT INTO Punctuation (Symbol) VALUES('-')
INSERT INTO Punctuation (Symbol) VALUES('.')

Next, you could create a function in SQL to remove all the punctuation symbols from an input string.

CREATE FUNCTION dbo.fn_RemovePunctuation
(
    @InputString VARCHAR(500)
)
RETURNS VARCHAR(500)
AS
BEGIN
    SELECT
        @InputString = REPLACE(@InputString, P.Symbol, '')
    FROM 
        Punctuation P

    RETURN @InputString
END
GO

Then you can just call the function in your UPDATE statement:

UPDATE tblMyTable SET FieldName = dbo.fn_RemovePunctuation(FieldName)
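
Before running the UPDATE, it may be worth checking what the function does. The following is only a sketch: it assumes the Punctuation table and dbo.fn_RemovePunctuation above have already been created, and the sample string is made up for illustration.

```sql
-- Assumes the Punctuation table and dbo.fn_RemovePunctuation defined above exist.
-- The literal below is an illustrative sample, not data from the question.
SELECT dbo.fn_RemovePunctuation('O''Brien-Smith Jr.') AS Cleaned;

-- Preview the rows the UPDATE would change, without modifying anything.
SELECT FieldName,
       dbo.fn_RemovePunctuation(FieldName) AS Cleaned
FROM tblMyTable
WHERE FieldName <> dbo.fn_RemovePunctuation(FieldName);
```

Only the symbols stored in the Punctuation table are removed, so any character not yet inserted there (a comma, for example) is left in place.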

Answered by Ken Mc

I wanted to avoid creating a table and wanted to remove everything except letters and digits.

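DECLARE @InStr Varchar(250) = 'Hello, World! 123'  -- sample input for illustration; in practice this is the value to clean
DECLARE @p int
DECLARE @Result Varchar(250)
DECLARE @BadChars Varchar(12)
SELECT @BadChars = '%[^a-z0-9]%'
-- to leave spaces - SELECT @BadChars = '%[^a-z0-9 ]%'

SET @Result = @InStr

-- Repeatedly cut out the first character that is not a letter or digit
SET @p = PatIndex(@BadChars, @Result)
WHILE @p > 0 BEGIN
    SELECT @Result = Left(@Result, @p-1) + Substring(@Result, @p+1, 250)
    SET @p = PatIndex(@BadChars, @Result)
END

SELECT @Result  -- the cleaned string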

Answered by priyanka.sarkar

I am proposing two solutions:

Solution 1: Make a noise table and replace the noises with blank spaces

e.g.

DECLARE @String VARCHAR(MAX)
DECLARE @Noise TABLE(Noise VARCHAR(100),ReplaceChars VARCHAR(10))
SET @String = 'hello! how * > are % u (: . I am ok :). Oh nice!'

INSERT INTO @Noise(Noise,ReplaceChars)
SELECT '!',SPACE(1) UNION ALL SELECT '@',SPACE(1) UNION ALL
SELECT '#',SPACE(1) UNION ALL SELECT '$',SPACE(1) UNION ALL
SELECT '%',SPACE(1) UNION ALL SELECT '^',SPACE(1) UNION ALL
SELECT '&',SPACE(1) UNION ALL SELECT '*',SPACE(1) UNION ALL
SELECT '(',SPACE(1) UNION ALL SELECT ')',SPACE(1) UNION ALL
SELECT '{',SPACE(1) UNION ALL SELECT '}',SPACE(1) UNION ALL
SELECT '<',SPACE(1) UNION ALL SELECT '>',SPACE(1) UNION ALL
SELECT ':',SPACE(1)

SELECT @String = REPLACE(@String, Noise, ReplaceChars) FROM @Noise
SELECT @String Data

Solution 2: With a number table

DECLARE @String VARCHAR(MAX)
SET @String = 'hello! & how * > are % u (: . I am ok :). Oh nice!'

;with numbercte as
(
 select 1 as rn
 union all
 select rn+1 from numbercte where rn<LEN(@String)
)
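select REPLACE(FilteredData,'&#x20;',SPACE(1)) Data from 
(select SUBSTRING(@String,rn,1) 
from numbercte  
where SUBSTRING(@String,rn,1) not in('!','*','>','<','%','(',')',':','!','&','@','#','$')
order by rn  -- keep the characters in their original order
for xml path(''))X(FilteredData)
option (maxrecursion 0)  -- the default recursion limit of 100 fails on longer strings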

Output (for Solution 2; Solution 1 gives the same words, with a space left in place of each removed character)

Data

hello  how   are  u  . I am ok . Oh nice

Note: I have only included some of the noise characters. You will need to add the ones you want removed.
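
On SQL Server 2017 or later, the same kind of noise list can also be handled without a table by using the built-in TRANSLATE function. This is a minimal sketch rather than part of the original answer: it assumes the placeholder character '~' never occurs in the data, and the variable names are made up for illustration.

```sql
-- Sketch only: TRANSLATE requires SQL Server 2017 or later.
-- Assumption: '~' never appears in the data; pick another placeholder if it can.
DECLARE @Unwanted VARCHAR(50) = '!@#$%^&*()<>:"';       -- characters to strip
DECLARE @Sample   VARCHAR(100) = 'hello! how * > are % u (:';

-- Step 1: map every unwanted character to the placeholder
--         (TRANSLATE needs both character lists to be the same length).
-- Step 2: remove the placeholder with a single REPLACE.
SELECT REPLACE(
           TRANSLATE(@Sample, @Unwanted, REPLICATE('~', LEN(@Unwanted))),
           '~', '') AS Cleaned;
```

TRANSLATE treats each listed character literally, so symbols such as % and ^ need no escaping here.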

Hope this helps

Answered by Fenton

You can use regular expressions in SQL Server - here is an article based on SQL 2005:

http://msdn.microsoft.com/en-us/magazine/cc163473.aspx

Answered by Jeff Orange

I took Ken Mc's solution and made it into a function which can replace all punctuation with a given string:

----------------------------------------------------------------------------------------------------------------
-- This function replaces all punctuation in the given string with the "replaceWith" string
----------------------------------------------------------------------------------------------------------------
IF object_id('[dbo].[fnReplacePunctuation]') IS NOT NULL
BEGIN
    DROP FUNCTION [dbo].[fnReplacePunctuation];
END;
GO
CREATE FUNCTION [dbo].[fnReplacePunctuation] (@string NVARCHAR(MAX), @replaceWith NVARCHAR(max))
RETURNS NVARCHAR(MAX)
BEGIN
    DECLARE @Result NVARCHAR(max) = @string; -- NVARCHAR to match the parameter and return type
    DECLARE @BadChars Varchar(12) = '%[^a-z0-9]%'; -- to leave spaces, use '%[^a-z0-9 ]%'
    DECLARE @p int = PatIndex(@BadChars,@Result);
    DECLARE @searchFrom INT;
    DECLARE @indexOfPunct INT = @p;

    WHILE @indexOfPunct > 0 BEGIN
      SET @searchFrom = LEN(@Result) - @p;
      SET @Result = Left(@Result, @p-1) + @replaceWith + Substring(@Result, @p+1,LEN(@Result));
      SET @IndexOfPunct = PatIndex(@BadChars, substring(@Result, (LEN(@Result) - @SearchFrom)+1, LEN(@Result)));
      SET @p = (LEN(@Result) - @searchFrom) + @indexOfPunct;
    END
    RETURN @Result;
END;
GO
-- example:
SELECT dbo.fnReplacePunctuation('This is, only, a tést-really..', '');

Output:

Thisisonlyatéstreally
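
Applied to the table from the question, the function could be used roughly as follows. This is a sketch that assumes dbo.fnReplacePunctuation above has been created and that tblMyTable and FieldName are the table and column from the question.

```sql
-- Sketch: assumes dbo.fnReplacePunctuation above exists and that
-- tblMyTable / FieldName are the table and column from the question.
UPDATE tblMyTable
SET FieldName = dbo.fnReplacePunctuation(FieldName, '')
WHERE FieldName LIKE '%[^a-z0-9]%';  -- only touch rows that contain something to strip
```

The WHERE clause reuses the function's own pattern so that rows which are already clean are not rewritten.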

Answered by gbn

I'd wrap it in a simple scalar UDF so all string cleaning is in one place if it's needed again.

Then you can use it on INSERT too...
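
For instance, with a cleaning function such as fnReplacePunctuation from the earlier answer already in place (an assumption; any scalar UDF would do), an INSERT might look like the sketch below. The inserted literal is made up for illustration.

```sql
-- Sketch: assumes dbo.fnReplacePunctuation from the earlier answer exists and
-- that FieldName is the only column tblMyTable requires.
INSERT INTO tblMyTable (FieldName)
VALUES (dbo.fnReplacePunctuation('Some, raw. value!', ''));
```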

Answered by Tiberiu Ana

If it's a one-off thing, I would use a C# + LINQ snippet in LINQPad to do the job with regular expressions.

It is quick and easy and you don't have to go through the process of setting up a CLR stored procedure and then cleaning up after yourself.

Answered by mahalie

Can't you use PATINDEX to only include NUMBERS and LETTERS instead of trying to guess what punctuation might be in the field? (Not trying to be snarky, if I had the code ready, I'd share it...but this is what I'm looking for).

Seems like you need to create a custom function in order to avoid a giant list of replace functions in your queries - here's a good example:

http://www.codeproject.com/KB/database/SQLPhoneNumbersPart_2.aspx?display=Print
