Notice: this page is derived from a popular StackOverflow question and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). StackOverflow original: http://stackoverflow.com/questions/25811534/

Time: 2020-09-01 02:37:30  Source: igfitidea

Extracting a string using SQL PATINDEX, substring of varying sizes

sql regex sql-server-2012 substring

Asked by Kyle

I'm trying to extract ###x###, ###x##, and sometimes #x#. Sometimes there may be a space between the numbers and the x. Essentially, I may run into strings like

  • 720x60
  • 720x600
  • 720 x 60
  • 720_x_60
  • 1x1

I use PATINDEX() to find the first occurrence of the pattern '%[0-9]%x%[0-9]%'. So far so good. Then I use PATINDEX() again to find the first non-digit character after that. This is where I have trouble. I get results as in the screenshot. Code is also below.

SELECT *
    ,CASE WHEN StartInt > 0
        THEN SUBSTRING(Placement, StartInt, SizeLength) ELSE NULL END AS PlacementSize
FROM
(SELECT Placement
    --find the first occurrence of #*x*#
    ,PATINDEX('%[0-9]%x%[0-9]%',Placement) AS StartInt

    --find the first non-digit after that
    ,PATINDEX(
        '%[^0-9]%'
        ,RIGHT(
            Placement + '_' --this underscore adds at least one non-digit to find
            ,LEN(Placement)
                -
            PATINDEX('%[0-9]%x%[0-9]%',Placement) - 5
            )
        ) + 6 AS SizeLength
FROM [Staging].[Client].[A01_FY14_Reporting_staging]
WHERE [Date] > '2014-07-01') AS a

Results:

[Image: screenshot of the query results]
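
For reference, here is a small sketch of what the first PATINDEX() call in the question returns. The sample strings are hypothetical and not taken from the original table. Because each `%` matches any run of characters, the result is the position of the first digit that is followed, anywhere later in the string, by an 'x' and another digit, which is not necessarily the start of the size itself.

```sql
-- Hypothetical sample values, used only to illustrate the pattern.
SELECT v.Placement,
       PATINDEX('%[0-9]%x%[0-9]%', v.Placement) AS StartInt
FROM (VALUES
        ('720x60'),            -- StartInt = 1
        ('ad_720 x 60'),       -- StartInt = 4
        ('Q3_banner_720x60'),  -- StartInt = 2 (the '3' in 'Q3', not the '7')
        ('no size here')       -- StartInt = 0 (no match)
     ) AS v(Placement);
```

The third row shows why the pattern alone may not be enough on dirty data: an earlier stray digit still satisfies it.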

Accepted answer by Jaaz Cole

If you're dealing with a pair of numeric values, but are also dealing with dirty data, and lack the power of Regex, here's what you can do in TSQL.

Essentially, it looks like you want to break the string in half at 'x', then whittle down the two halves until you have numeric-only values. Using a set of derived tables, this becomes relatively easy (and not too hard to read).

declare @placements table (Placement varchar(10))
insert into @placements values 
('720x60'),
('720x600'),
('720 x 60'),
('720_x_60'),
('1x1')

SELECT LEFT(LeftOfX,PATINDEX('%[^0-9]%',LeftOfX) - 1) + 'x' + RIGHT(RightOfX, LEN(RightOfX) - PATINDEX('%[0-9]%', RightOfX) + 1)
FROM (
    SELECT RIGHT(LeftOfX, LEN(LeftOfX) - PATINDEX('%[0-9]%', LeftOfX) + 1) AS LeftOfX, LEFT(RightOfX, LEN(RightOfX) - PATINDEX('%[0-9]%', REVERSE(RightOfX)) + 1) AS RightOfX
    FROM (
        SELECT LEFT(p.Placement,x) AS LeftOfX, RIGHT(p.Placement,LEN(p.Placement) - x + 1) AS RightOfX
        FROM (
            SELECT
                  p.Placement
                , CHARINDEX('x',p.Placement) AS x
            FROM @placements p
            ) p
        ) p
    ) p

Here's the SQLFiddle example.

First, select your placement, the location of your 'x' in Placement, and other columns you want from the table. Pass the other columns up through the derived tables.

Next, split the string into Left and Right.

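Process the left and right halves in two more queries. The first trims the left half so it starts at its first digit and trims the right half so it ends at its last digit. The second keeps the leading digits of the left half (everything before the first non-digit) and the trailing digits of the right half (everything from its first digit on).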
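
The three steps above can be traced on a single value. This is only a sketch: the sample string and the variable names are made up for illustration, and the expressions are the same ones used in the derived tables.

```sql
-- Hypothetical sample value, not from the original data.
DECLARE @Placement varchar(50) = 'ad_720 x 60_';

-- Step 1: locate the first 'x' and split around it (the 'x' stays in both halves).
DECLARE @x int = CHARINDEX('x', @Placement);                                  -- 8
DECLARE @LeftOfX  varchar(50) = LEFT(@Placement, @x);                         -- 'ad_720 x'
DECLARE @RightOfX varchar(50) = RIGHT(@Placement, LEN(@Placement) - @x + 1);  -- 'x 60_'

-- Step 2: make the left half start at its first digit
-- and the right half end at its last digit.
SET @LeftOfX  = RIGHT(@LeftOfX, LEN(@LeftOfX) - PATINDEX('%[0-9]%', @LeftOfX) + 1);             -- '720 x'
SET @RightOfX = LEFT(@RightOfX, LEN(@RightOfX) - PATINDEX('%[0-9]%', REVERSE(@RightOfX)) + 1);  -- 'x 60'

-- Step 3: keep the digits before the first non-digit on the left,
-- and everything from the first digit on the right.
SELECT LEFT(@LeftOfX, PATINDEX('%[^0-9]%', @LeftOfX) - 1)
       + 'x'
       + RIGHT(@RightOfX, LEN(@RightOfX) - PATINDEX('%[0-9]%', @RightOfX) + 1) AS PlacementSize; -- '720x60'
```

Note that the split uses the first 'x' in the string, so a value with an earlier 'x' in its text would need extra handling.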

EDIT: Fixed the outputs, both numbers now selected.
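
As a sketch of "passing other columns up through the derived tables", the variant below carries Placement and x to the outer query and uses x in a CASE guard. The guard is an addition, not part of the original answer: it assumes you want NULL for rows that contain no 'x', where CHARINDEX returns 0 and the outer LEFT would otherwise be given a negative length. The sample rows are hypothetical.

```sql
-- Sample rows are hypothetical; the second one contains no 'x'.
DECLARE @placements TABLE (Placement varchar(50));
INSERT INTO @placements VALUES ('720 x 60'), ('no size here');

SELECT p.Placement,
       CASE WHEN p.x > 0   -- skip rows where CHARINDEX found no 'x'
            THEN LEFT(p.LeftOfX, PATINDEX('%[^0-9]%', p.LeftOfX) - 1)
                 + 'x'
                 + RIGHT(p.RightOfX, LEN(p.RightOfX) - PATINDEX('%[0-9]%', p.RightOfX) + 1)
       END AS PlacementSize
FROM (
    SELECT p.Placement, p.x,   -- carried up unchanged
           RIGHT(p.LeftOfX, LEN(p.LeftOfX) - PATINDEX('%[0-9]%', p.LeftOfX) + 1) AS LeftOfX,
           LEFT(p.RightOfX, LEN(p.RightOfX) - PATINDEX('%[0-9]%', REVERSE(p.RightOfX)) + 1) AS RightOfX
    FROM (
        SELECT p.Placement, p.x,
               LEFT(p.Placement, p.x) AS LeftOfX,
               RIGHT(p.Placement, LEN(p.Placement) - p.x + 1) AS RightOfX
        FROM (
            SELECT Placement, CHARINDEX('x', Placement) AS x
            FROM @placements
        ) AS p
    ) AS p
) AS p;
```

With these two rows, PlacementSize should come back as 720x60 for the first and NULL for the second. Any other column you need (a date, an ID) can be carried up the same way as Placement.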
