How do I split a string so I can access item x?

Note: this page is a translation of a popular Stack Overflow question and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL and authors, and attribute the content to the original authors (not me). Original Stack Overflow question: http://stackoverflow.com/questions/2647/

Date: 2020-08-31 23:04:20. Source: igfitidea

Tags: sql, sql-server, tsql, split

Asked by GateKiller

Using SQL Server, how do I split a string so I can access item x?

Take a string "Hello John Smith". How can I split the string by space and access the item at index 1 which should return "John"?

Accepted answer by Jonesinator

You may find the solution in "SQL User Defined Function to Parse a Delimited String" (from The Code Project) helpful.

You can use this simple logic:

Declare @products varchar(200) = '1|20|3|343|44|6|8765'
Declare @individual varchar(20) = null

WHILE LEN(@products) > 0
BEGIN
    IF PATINDEX('%|%', @products) > 0
    BEGIN
        SET @individual = SUBSTRING(@products,
                                    0,
                                    PATINDEX('%|%', @products))
        SELECT @individual

        SET @products = SUBSTRING(@products,
                                  LEN(@individual + '|') + 1,
                                  LEN(@products))
    END
    ELSE
    BEGIN
        SET @individual = @products
        SET @products = NULL
        SELECT @individual
    END
END
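
The loop above prints every item. If only item x is needed, the same idea can stop as soon as that item is reached. The following is a minimal sketch, not part of the original answer: the variables @wanted, @current and @pos are made up for illustration, and positions are assumed to be 1-based.

```sql
DECLARE @products varchar(200) = '1|20|3|343|44|6|8765';
DECLARE @wanted  int = 3;   -- hypothetical: 1-based position of the item to return
DECLARE @current int = 1;   -- position of the item currently at the front of @products
DECLARE @pos     int;

-- Drop leading items until the wanted one is at the front (or nothing is left to drop)
WHILE @current < @wanted AND CHARINDEX('|', @products) > 0
BEGIN
    SET @pos = CHARINDEX('|', @products);
    SET @products = SUBSTRING(@products, @pos + 1, LEN(@products));
    SET @current += 1;
END

-- Returns NULL when the list has fewer than @wanted items
SELECT CASE WHEN @current = @wanted
            THEN LEFT(@products, CHARINDEX('|', @products + '|') - 1)
       END AS item;
```

With the sample list and @wanted = 3 this returns 3.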

Answer by Nathan Bedford

I don't believe SQL Server has a built-in split function, so other than a UDF, the only other answer I know is to hijack the PARSENAME function:

SELECT PARSENAME(REPLACE('Hello John Smith', ' ', '.'), 2) 

PARSENAME takes a string and splits it on the period character. It takes a number as its second argument, and that number specifies which segment of the string to return (working from back to front).

SELECT PARSENAME(REPLACE('Hello John Smith', ' ', '.'), 3)  --return Hello

The obvious problem is that this breaks when the string already contains a period. PARSENAME also handles at most four parts and returns NULL for anything longer. I still think using a UDF is the best way... any other suggestions?
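
Because PARSENAME counts from the back, an index counted from the front has to be converted first. A small sketch of that conversion, assuming the string has no leading or trailing spaces, contains no periods, and has at most four parts; the variable names @s, @index and @parts are hypothetical:

```sql
DECLARE @s varchar(100) = 'Hello John Smith';
DECLARE @index int = 1;  -- zero-based index counted from the left

-- Number of parts = number of spaces + 1
DECLARE @parts int = LEN(@s) - LEN(REPLACE(@s, ' ', '')) + 1;

-- PARSENAME counts from the right, so convert the left-based index
SELECT PARSENAME(REPLACE(@s, ' ', '.'), @parts - @index) AS item;
```

For the sample string this selects part 2 from the right, which is John.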

Answer by vzczc

First, create a function. Using a recursive CTE (common table expression) does away with the need for a temp table:

 create function dbo.SplitString 
    (
        @str nvarchar(4000), 
        @separator char(1)
    )
    returns table
    AS
    return (
        with tokens(p, a, b) AS (
            select 
                1, 
                1, 
                charindex(@separator, @str)
            union all
            select
                p + 1, 
                b + 1, 
                charindex(@separator, @str, b + 1)
            from tokens
            where b > 0
        )
        select
            p-1 zeroBasedOccurance,
            substring(
                @str, 
                a, 
                case when b > 0 then b-a ELSE 4000 end) 
            AS s
        from tokens
      )
    GO

Then use it like any table (or modify it to fit within your existing stored proc):

select s 
from dbo.SplitString('Hello John Smith', ' ')
where zeroBasedOccurance=1

Update

The previous version would fail for input strings longer than 4000 characters. This version removes that limitation:

create function dbo.SplitString 
(
    @str nvarchar(max), 
    @separator char(1)
)
returns table
AS
return (
with tokens(p, a, b) AS (
    select 
        cast(1 as bigint), 
        cast(1 as bigint), 
        charindex(@separator, @str)
    union all
    select
        p + 1, 
        b + 1, 
        charindex(@separator, @str, b + 1)
    from tokens
    where b > 0
)
select
    p-1 ItemIndex,
    substring(
        @str, 
        a, 
        case when b > 0 then b-a ELSE LEN(@str) end) 
    AS s
from tokens
);

GO

Usage remains the same, except that the index column is now named ItemIndex instead of zeroBasedOccurance.
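
For reference, a query against the updated function might look like the sketch below. It assumes the nvarchar(max) version of dbo.SplitString above has been created. The OPTION (MAXRECURSION 0) hint is an addition, not part of the original answer: a recursive CTE stops with an error after 100 recursion levels by default, so a string with more than about 100 delimiters would otherwise fail, and the hint has to go on the calling statement because it cannot be placed inside the inline function.

```sql
-- ItemIndex is zero-based, so 1 selects the second item
SELECT s
FROM dbo.SplitString(N'Hello John Smith', ' ')
WHERE ItemIndex = 1
OPTION (MAXRECURSION 0);  -- lift the default limit of 100 recursion levels
```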

Answer by Aaron Bertrand

Most of the solutions here use while loops or recursive CTEs. A set-based approach will be superior, I promise, if you can use a delimiter other than a space:

CREATE FUNCTION [dbo].[SplitString]
    (
        @List NVARCHAR(MAX),
        @Delim VARCHAR(255)
    )
    RETURNS TABLE
    AS
        RETURN ( SELECT [Value], idx = RANK() OVER (ORDER BY n) FROM 
          ( 
            SELECT n = Number, 
              [Value] = LTRIM(RTRIM(SUBSTRING(@List, [Number],
              CHARINDEX(@Delim, @List + @Delim, [Number]) - [Number])))
            FROM (SELECT Number = ROW_NUMBER() OVER (ORDER BY name)
              FROM sys.all_objects) AS x
              WHERE Number <= LEN(@List)
              AND SUBSTRING(@Delim + @List, [Number], LEN(@Delim)) = @Delim
          ) AS y
        );

Sample usage:

SELECT Value FROM dbo.SplitString('foo,bar,blat,foo,splunge',',')
  WHERE idx = 3;

Results:

----
blat

You could also add the idx you want as an argument to the function, but I'll leave that as an exercise to the reader.

You can't do this with just the native STRING_SPLIT function added in SQL Server 2016, because there is no guarantee that the output will be rendered in the order of the original list. In other words, if you pass in 3,6,1 the result will likely be in that order, but it could be 1,3,6. I have asked for the community's help in improving the built-in function here:

With enough qualitative feedback, they may actually consider making some of these enhancements:

More on split functions, why (and proof that) while loops and recursive CTEs don't scale, and better alternatives, if splitting strings coming from the application layer:

On SQL Server 2016 or above, though, you should look at STRING_SPLIT() and STRING_AGG():
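
The ordering limitation described in this answer applies to the two-argument form of STRING_SPLIT. Newer releases (SQL Server 2022 and Azure SQL Database, which postdate this answer) accept an optional third argument, enable_ordinal, that adds a 1-based ordinal column. A minimal sketch, assuming one of those versions is available:

```sql
-- The third argument (1) enables the ordinal output column
SELECT value
FROM STRING_SPLIT('Hello John Smith', ' ', 1)
WHERE ordinal = 2;  -- ordinal is 1-based, so this picks the second item
```

On SQL Server 2016 through 2019 the third argument is not accepted, so one of the other approaches in this thread is still needed there.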

Answer by Nathan Skerl

You can leverage a Number table to do the string parsing.

Create a physical numbers table:

    create table dbo.Numbers (N int primary key);
    insert into dbo.Numbers
        select top 1000 row_number() over(order by number) from master..spt_values
    go

Create a test table with 1,000,000 rows:

    create table #yak (i int identity(1,1) primary key, array varchar(50))

    insert into #yak(array)
        select 'a,b,c' from dbo.Numbers n cross join dbo.Numbers nn
    go

Create the function

    create function [dbo].[ufn_ParseArray]
        (   @Input      nvarchar(4000), 
            @Delimiter  char(1) = ',',
            @BaseIdent  int
        )
    returns table as
    return  
        (   select  row_number() over (order by n asc) + (@BaseIdent - 1) [i],
                    substring(@Input, n, charindex(@Delimiter, @Input + @Delimiter, n) - n) s
            from    dbo.Numbers
            where   n <= convert(int, len(@Input)) and
                    substring(@Delimiter + @Input, n, 1) = @Delimiter
        )
    go

Usage (outputs 3 million rows in 40 seconds on my laptop):

    select * 
    from #yak 
    cross apply dbo.ufn_ParseArray(array, ',', 1)

Cleanup:

    drop table dbo.Numbers;
    drop function  [dbo].[ufn_ParseArray];
    drop table #yak;

Performance here is not amazing, but calling a function over a million-row table is not the best idea. If I had to split strings across many rows, I would avoid the function.

Answer by Shnugo

This question is not about a string split approach, but about how to get the nth element.

All answers here do some kind of string splitting using recursion, CTEs, multiple CHARINDEX, REVERSE and PATINDEX calls, invented functions, calls to CLR methods, number tables, CROSS APPLYs... Most answers span many lines of code.

But if you really want nothing more than a way to get the nth element, this can be done as a real one-liner: no UDF, not even a sub-select. And as an extra benefit, it is type-safe.

Get part 2 delimited by a space:

DECLARE @input NVARCHAR(100)=N'part1 part2 part3';
SELECT CAST(N'<x>' + REPLACE(@input,N' ',N'</x><x>') + N'</x>' AS XML).value('/x[2]','nvarchar(max)')

Of course you can use variables for the delimiter and position (use sql:column to retrieve the position directly from a query's value):

DECLARE @dlmt NVARCHAR(10)=N' ';
DECLARE @pos INT = 2;
SELECT CAST(N'<x>' + REPLACE(@input,@dlmt,N'</x><x>') + N'</x>' AS XML).value('/x[sql:variable("@pos")][1]','nvarchar(max)')

If your string might include forbidden characters (especially one of &, > and <), you can still do it this way. Just use FOR XML PATH on your string first to implicitly replace all forbidden characters with the fitting escape sequences.

It's a very special case if, additionally, your delimiter is the semicolon, because the escape sequences (such as &amp;) themselves end in a semicolon. In this case I first replace the delimiter with '#DLMT#', and finally replace that with the XML tags:

SET @input=N'Some <, > and &;Other ??ü@;One more';
SET @dlmt=N';';
SELECT CAST(N'<x>' + REPLACE((SELECT REPLACE(@input,@dlmt,'#DLMT#') AS [*] FOR XML PATH('')),N'#DLMT#',N'</x><x>') + N'</x>' AS XML).value('/x[sql:variable("@pos")][1]','nvarchar(max)');

UPDATE for SQL-Server 2016+

Regretfully, the developers forgot to return the part's index with STRING_SPLIT. But on SQL Server 2016+ there are JSON_VALUE and OPENJSON.

With JSON_VALUE we can pass in the position as the array index.

For OPENJSON, the documentation states clearly:

When OPENJSON parses a JSON array, the function returns the indexes of the elements in the JSON text as keys.

A string like 1,2,3 needs nothing more than brackets: [1,2,3].
A string of words like this is an example needs to be ["this","is","an","example"].
These are very easy string operations. Just try it out:

DECLARE @str VARCHAR(100)='Hello John Smith';
DECLARE @position INT = 2;

--We can build the json-path '$[1]' using CONCAT
SELECT JSON_VALUE('["' + REPLACE(@str,' ','","') + '"]',CONCAT('$[',@position-1,']'));

--See this for a position safe string-splitter (zero-based):

SELECT  JsonArray.[key] AS [Position]
       ,JsonArray.[value] AS [Part]
FROM OPENJSON('["' + REPLACE(@str,' ','","') + '"]') JsonArray
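
One caveat with building the JSON by hand: a double quote or backslash in the input would produce invalid JSON. The built-in STRING_ESCAPE function (SQL Server 2016+) can escape the input first. The following is only a sketch along the lines of the queries above; the variable @raw and its sample value are made up for illustration.

```sql
DECLARE @raw NVARCHAR(100) = N'He said "hi" to John';

-- Escape JSON special characters before turning the delimiters into array separators
SELECT  JsonArray.[key]   AS [Position]
       ,JsonArray.[value] AS [Part]
FROM OPENJSON(N'["' + REPLACE(STRING_ESCAPE(@raw, 'json'), N' ', N'","') + N'"]') JsonArray;
```

This assumes the delimiter itself is not a character that STRING_ESCAPE rewrites (for example a double quote or a backslash).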

In this post I tested various approaches and found that OPENJSON is really fast. Even much faster than the famous "delimitedSplit8k()" method...

UPDATE 2 - Get the values type-safe

We can use an array within an array simply by using doubled [[]]. This allows for a typed WITH clause:

DECLARE  @SomeDelimitedString VARCHAR(100)='part1|1|20190920';

DECLARE @JsonArray NVARCHAR(MAX)=CONCAT('[["',REPLACE(@SomeDelimitedString,'|','","'),'"]]');

SELECT @SomeDelimitedString          AS TheOriginal
      ,@JsonArray                    AS TransformedToJSON
      ,ValuesFromTheArray.*
FROM OPENJSON(@JsonArray)
WITH(TheFirstFragment  VARCHAR(100) '$[0]'
    ,TheSecondFragment INT          '$[1]'
    ,TheThirdFragment  DATE         '$[2]') ValuesFromTheArray

Answer by brendan

Here is a UDF that will do it. It returns a table of the delimited values. I haven't tried every scenario, but your example works fine.


CREATE FUNCTION SplitString 
(
    -- Add the parameters for the function here
    @myString varchar(500),
    @deliminator varchar(10)
)
RETURNS 
@ReturnTable TABLE 
(
    -- Add the column definitions for the TABLE variable here
    [id] [int] IDENTITY(1,1) NOT NULL,
    [part] [varchar](50) NULL
)
AS
BEGIN
        Declare @iSpaces int
        Declare @part varchar(50)

        --initialize spaces
        Select @iSpaces = charindex(@deliminator,@myString,0)
        While @iSpaces > 0

        Begin
            Select @part = substring(@myString,0,charindex(@deliminator,@myString,0))

            Insert Into @ReturnTable(part)
            Select @part

            -- Keep everything after the delimiter; DATALENGTH (not LEN) so a space delimiter counts as 1 character
            Select @myString = substring(@myString,charindex(@deliminator,@myString,0) + datalength(@deliminator),datalength(@myString))


            Select @iSpaces = charindex(@deliminator,@myString,0)
        end

        If len(@myString) > 0
            Insert Into @ReturnTable
            Select @myString

    RETURN 
END
GO

You would call it like this:


Select * From SplitString('Hello John Smith',' ')

Edit: Updated the solution to handle delimiters with a length greater than 1, as in:


select * From SplitString('Hello**John**Smith','**')

Answer by Sivaganesh Tamilvendhan

Here is a simple solution:

CREATE FUNCTION [dbo].[split](
          @delimited NVARCHAR(MAX),
          @delimiter NVARCHAR(100)
        ) RETURNS @t TABLE (id INT IDENTITY(1,1), val NVARCHAR(MAX))
        AS
        BEGIN
          DECLARE @xml XML
          SET @xml = N'<t>' + REPLACE(@delimited,@delimiter,'</t><t>') + '</t>'

          INSERT INTO @t(val)
          SELECT  r.value('.','varchar(MAX)') as item
          FROM  @xml.nodes('/t') as records(r)
          RETURN
        END


Execute the function like this:

  select * from dbo.split('Hello John Smith',' ')
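
Because the function numbers its rows with an identity column, picking item x is just a filter on id. A short sketch, assuming the dbo.split function above has been created and that the ids follow the order of the items (which the insert does not strictly guarantee):

```sql
-- id is 1-based, so id = 2 is expected to be the second item
select val
from dbo.split('Hello John Smith', ' ')
where id = 2
```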

Answer by Frederic

What about using a string and a values() statement?

DECLARE @str varchar(max)
SET @str = 'Hello John Smith'

DECLARE @separator varchar(max)
SET @separator = ' '

DECLARE @Splited TABLE(id int IDENTITY(1,1), item varchar(max))

SET @str = REPLACE(@str, @separator, '''),(''')
SET @str = 'SELECT * FROM (VALUES(''' + @str + ''')) AS V(A)' 

INSERT INTO @Splited
EXEC(@str)

SELECT * FROM @Splited

Result-set achieved.

id  item
1   Hello
2   John
3   Smith
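
Note that this approach builds and executes dynamic SQL, so a single quote in the input breaks the statement and untrusted input could inject SQL. A hedged variation that at least doubles embedded single quotes before building the statement is sketched below; the sample value is made up, and this is an illustration rather than a complete defence.

```sql
DECLARE @str varchar(max) = 'O''Brien John Smith';  -- sample value containing a single quote
DECLARE @separator varchar(max) = ' ';
DECLARE @Splited TABLE(id int IDENTITY(1,1), item varchar(max));

-- Double embedded single quotes so they stay inside the generated string literals
SET @str = REPLACE(@str, '''', '''''');
SET @str = REPLACE(@str, @separator, '''),(''');
SET @str = 'SELECT * FROM (VALUES(''' + @str + ''')) AS V(A)';

INSERT INTO @Splited (item)
EXEC(@str);

SELECT item FROM @Splited WHERE id = 2;  -- second item, assuming ids follow the list order
```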

Answer by Damon Drake

In my opinion you guys are making it way too complicated. Just create a CLR UDF and be done with it.

using System;
using System.Data;
using System.Data.SqlClient;
using System.Data.SqlTypes;
using Microsoft.SqlServer.Server;
using System.Collections.Generic;

public partial class UserDefinedFunctions {
  [SqlFunction]
  public static SqlString SearchString(string Search) {
    List<string> SearchWords = new List<string>();
    foreach (string s in Search.Split(new char[] { ' ' })) {
      if (!s.ToLower().Equals("or") && !s.ToLower().Equals("and")) {
        SearchWords.Add(s);
      }
    }

    return new SqlString(string.Join(" OR ", SearchWords.ToArray()));
  }
};