使用 PATINDEX 在 T-SQL 中查找变长模式

Question

提问by Rich

I'm looking to pull floats out of some varchars, using PATINDEX() to spot them. I know in each varchar string, I'm only interested in the first float that exists, but they might have different lengths.

我希望从一些 varchars 中提取浮点数，使用 PATINDEX() 来发现它们。我知道在每个 varchar 字符串中，我只对存在的第一个浮点数感兴趣，但它们的长度可能不同。

e.g.

例如

'some text 456.09 other text'
'even more text 98273.453 la la la'

I would normally match these with a regex

我通常会将这些与正则表达式匹配

  "[0-9]+[.][0-9]+"

However, I can't find an equivalent for the + operator, which PATINDEX accepts. So they would need to be matched (respectively) with:

但是，我找不到 PATINDEX 接受的 + 运算符的等效项。因此，它们需要（分别）与：

'[0-9][0-9][0-9].[0-9][0-9]' and '[0-9][0-9][0-9][0-9][0-9].[0-9][0-9][0-9]'

Is there any way to match both of these example varchars with one single valid PATINDEX pattern?

有没有办法将这两个示例 varchars 与一个有效的 PATINDEX 模式匹配？

Answer 1

采纳答案by Klas Lindb?ck

PATINDEX is not powerful enough to do that. You should use regular expressions.

PATINDEX 的功能不足以做到这一点。您应该使用正则表达式。

SQL Server has Regular expression support since SQL Server 2005.

SQL Server 自 SQL Server 2005 起就支持正则表达式。

Answer 2

回答by George Mastros

I blogged about this a while ago. Extracting numbers with SQL server

不久前我在博客上写过这个。使用 SQL 服务器提取数字

Declare @Temp Table(Data VarChar(100))

Insert Into @Temp Values('some text 456.09 other text')
Insert Into @Temp Values('even more text 98273.453 la la la')
Insert Into @Temp Values('There are no numbers in this one')

Select Left(
             SubString(Data, PatIndex('%[0-9.-]%', Data), 8000),
             PatIndex('%[^0-9.-]%', SubString(Data, PatIndex('%[0-9.-]%', Data), 8000) + 'X')-1)
From   @Temp

Answer 3

回答by Ben

Wildcards.

通配符。

SELECT PATINDEX('%[0-9]%[0-9].[0-9]%[0-9]%','some text 456.09 other text')
SELECT PATINDEX('%[0-9]%[0-9].[0-9]%[0-9]%','even more text 98273.453 la la la')

Answer 4

回答by paparazzo

Yes you need to link to the clr to get regex support. But if PATINDEX does not do what you need then regex was designed exactly for that.

是的，您需要链接到 clr 才能获得正则表达式支持。但是，如果 PATINDEX 不能满足您的需求，那么正则表达式正是为此而设计的。

http://msdn.microsoft.com/en-us/magazine/cc163473.aspx

Answer 5

回答by David Brabant

Should be checked for robustness (what if you only have an int, for example), but this is just to put you on a track:

应该检查稳健性（例如，如果您只有一个 int 会怎样），但这只是为了让您走上正轨：

if exists (select routine_name from information_schema.routines where routine_name = 'GetFirstFloat')
    drop function GetFirstFloat
go

create function GetFirstFloat (@string varchar(max))
returns float
as
begin
    declare @float varchar(max)
    declare @pos int

    select @pos = patindex('%[0-9]%', @string)
    select @float = ''

    while isnumeric(substring(@string, @pos, 1)) = 1
    begin
        select @float = @float + substring(@string, @pos, 1)
        select @pos = @pos + 1
    end

    return cast(@float as float)
end
go


select dbo.GetFirstFloat('this is a string containing pi 3.14159216 and another non float 3 followed by a new fload 5.41 and that''s it')
select dbo.GetFirstFloat('this is a string with no float')
select dbo.GetFirstFloat('this is another string with an int 3')

Answer 6

回答by Peter

Given that the pattern is going to be varied in length, you're not going to have a rough time getting this to work with PATINDEX. There is another post that I wrote, which I've modified to accomplish what you're trying to do here. Will this work for you?

考虑到模式的长度会有所不同，您将不会有困难的时间让它与 PATINDEX 一起使用。我写了另一篇文章，我对其进行了修改以完成您在此处尝试执行的操作。这对你有用吗？

CREATE TABLE #nums (n INT)
DECLARE @i INT 
SET @i = 1
WHILE @i < 8000 
BEGIN
    INSERT #nums VALUES(@i)
    SET @i = @i + 1
END

CREATE TABLE #tmp (
  id INT IDENTITY(1,1) not null,
  words VARCHAR(MAX) null
)

INSERT INTO #tmp
VALUES('I''m looking for a number, regardless of length, even 23.258 long'),('Maybe even pi which roughly 3.14159265358,'),('or possibly something else that isn''t a number')

UPDATE #tmp SET words = REPLACE(words, ',',' ')

;WITH CTE AS (SELECT ROW_NUMBER() OVER (ORDER BY ID) AS rownum, ID, NULLIF(SUBSTRING(' ' + words + ' ' , n , CHARINDEX(' ' , ' ' + words + ' ' , n) - n) , '') AS word
    FROM #nums, #tmp
    WHERE ID <= LEN(' ' + words + ' ') AND SUBSTRING(' ' + words + ' ' , n - 1, 1) = ' ' 
    AND CHARINDEX(' ' , ' ' + words + ' ' , n) - n > 0),
    ids AS (SELECT ID, MIN(rownum) AS rownum FROM CTE WHERE ISNUMERIC(word) = 1 GROUP BY id)
SELECT CTE.rownum, cte.id, cte.word
FROM CTE, ids WHERE cte.id = ids.id AND cte.rownum = ids.rownum

The explanation and origin of the code is covered in more detail in the origional post

代码的解释和来源在原始帖子中有更详细的介绍

使用 PATINDEX 在 T-SQL 中查找变长模式

提问by Rich

采纳答案by Klas Lindb?ck

回答by George Mastros

回答by Ben

回答by paparazzo

回答by David Brabant

回答by Peter

相关推荐

最近更新

标签

使用 PATINDEX 在 T-SQL 中查找变长模式

提问by Rich

采纳答案by Klas Lindb?ck

回答by George Mastros

回答by Ben

回答by paparazzo

回答by David Brabant

回答by Peter

相关推荐

SQL 获取明天的日期

如何创建带参数的 SQL 函数？

SQL 如何只选择满足条件的第一行？

SQL 解码空值

相关推荐

最近更新

标签