Note: this page reproduces a popular Stack Overflow question and its answers, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL, and attribute it to the original authors (not me). Original Stack Overflow URL: http://stackoverflow.com/questions/471914/

Can you split/explode a field in a MySQL query?

mysql

Asked by nickf

I have to create a report on some student completions. The students each belong to one client. Here are the tables (simplified for this question).

CREATE TABLE  `clients` (
  `clientId` int(10) unsigned NOT NULL auto_increment,
  `clientName` varchar(100) NOT NULL default '',
  `courseNames` varchar(255) NOT NULL default ''
)

The courseNames field holds a comma-delimited string of course names, e.g. "AB01,AB02,AB03".

CREATE TABLE  `clientenrols` (
  `clientEnrolId` int(10) unsigned NOT NULL auto_increment,
  `studentId` int(10) unsigned NOT NULL default '0',
  `courseId` tinyint(3) unsigned NOT NULL default '0'
)

The courseId field here is the (0-based) index of the course name in the clients.courseNames field. So, if the client's courseNames are "AB01,AB02,AB03" and the courseId of the enrolment is 2, then the student is in AB03.

Is there a way that I can do a single select on these tables that includes the course name? Keep in mind that there will be students from different clients (who hence have different course names, not all of which are sequential, e.g. "NW01,NW03").

Basically, if I could split that field and return a single element from the resulting array, that would be what I'm looking for. Here's what I mean in magical pseudocode:

SELECT e.`studentId`, SPLIT(",", c.`courseNames`)[e.`courseId`]
FROM ...

Answer by Melchior Blausand

Until now, I wanted to keep those comma separated lists in my SQL db - well aware of all warnings!

I kept thinking that they have benefits over lookup tables (which provide a path to a normalized database). After some days of refusing, I've seen the light:

  • Using lookup tables is NOT causing more code than those ugly string operations when using comma separated values in one field.
  • The lookup table allows for native number formats and is thus NOT bigger than those csv fields. It is SMALLER though.
  • The involved string operations are slim in high level language code (SQL and PHP), but expensive compared to using arrays of integers.
  • Databases are not meant to be human readable, and it is mostly stupid to try to stick to structures due to their readability / direct editability, as I did.

In short, there is a reason why there is no native SPLIT() function in MySQL.
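
The lookup-table approach described above could be sketched as follows. This is only an illustration under stated assumptions: the client_courses table and the students table (linking each student to a client) are hypothetical names that do not appear in the question, which leaves that part of the schema out.

```sql
-- Hypothetical lookup table replacing clients.courseNames (name and columns are assumptions).
CREATE TABLE client_courses (
  clientId   INT UNSIGNED     NOT NULL,
  courseId   TINYINT UNSIGNED NOT NULL,  -- same 0-based index as clientenrols.courseId
  courseName VARCHAR(20)      NOT NULL,
  PRIMARY KEY (clientId, courseId)
);

-- The string "AB01,AB02,AB03" for client 1 becomes three rows.
INSERT INTO client_courses (clientId, courseId, courseName) VALUES
  (1, 0, 'AB01'),
  (1, 1, 'AB02'),
  (1, 2, 'AB03');

-- The course name is then a plain, indexable join.
-- students(studentId, clientId) is an assumed table, not part of the question.
SELECT e.studentId, cc.courseName
FROM clientenrols AS e
JOIN students AS s
  ON s.studentId = e.studentId
JOIN client_courses AS cc
  ON cc.clientId = s.clientId
 AND cc.courseId = e.courseId;
```

With this layout no string splitting is needed at query time, and the primary key on (clientId, courseId) can serve the join.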

Answer by eithed

Seeing that it's a fairly popular question - the answer is YES.

For a column named column in a table named table, containing all of your comma-separated values:

CREATE TEMPORARY TABLE temp (val CHAR(255));
SET @S1 = CONCAT("INSERT INTO temp (val) VALUES ('",REPLACE((SELECT GROUP_CONCAT( DISTINCT  `column`) AS data FROM `table`), ",", "'),('"),"');");
PREPARE stmt1 FROM @s1;
EXECUTE stmt1;
SELECT DISTINCT(val) FROM temp;

Please remember, however, not to store CSV in your DB.

Per @Mark Amery: as this translates comma-separated values into an INSERT statement, be careful when running it on unsanitised data.

Just to reiterate, please don't store CSV in your DB; this function is meant to translate CSV into a sensible DB structure and not to be used anywhere in your code. If you have to use it in production, please rethink your DB structure.

Answer by Josias Iquabius

You can create a function for this:

/**
* Split a string by a delimiter (similar to the PHP function explode()).
*
* @param VARCHAR(12) delim The boundary string (delimiter).
* @param VARCHAR(255) str The input string.
* @param INT pos The 1-based index of the substring to return.
* @return VARCHAR(255) The [pos]th substring created by splitting str on boundaries formed by delim.
* @example
*     SELECT SPLIT_STRING('|', 'one|two|three|four', 1);
*     This query returns 'one'.
*/
DROP FUNCTION IF EXISTS SPLIT_STRING;
CREATE FUNCTION SPLIT_STRING(delim VARCHAR(12), str VARCHAR(255), pos INT)
RETURNS VARCHAR(255) DETERMINISTIC
RETURN
    REPLACE(
        SUBSTRING(
            -- everything up to (but not including) the pos-th delimiter
            SUBSTRING_INDEX(str, delim, pos),
            -- skip the first pos-1 items; CHAR_LENGTH counts characters, as SUBSTRING does
            CHAR_LENGTH(SUBSTRING_INDEX(str, delim, pos-1)) + 1
        ),
        -- strip the leading delimiter that remains when pos > 1
        delim, ''
    );

Converting the magical pseudocode to use this, you would have the following (pos is 1-based, whereas courseId in the question is 0-based, hence the + 1):

SELECT e.`studentId`, SPLIT_STRING(',', c.`courseNames`, e.`courseId` + 1)
FROM ...

Answer by Mark Amery

MySQL's only string-splitting function is SUBSTRING_INDEX(str, delim, count). You can use it to, for example:

  • Return the item before the first separator in a string:

    mysql> SELECT SUBSTRING_INDEX('foo#bar#baz#qux', '#', 1);
    +--------------------------------------------+
    | SUBSTRING_INDEX('foo#bar#baz#qux', '#', 1) |
    +--------------------------------------------+
    | foo                                        |
    +--------------------------------------------+
    1 row in set (0.00 sec)
    
  • Return the item after the last separator in a string:

    mysql> SELECT SUBSTRING_INDEX('foo#bar#baz#qux', '#', -1);
    +---------------------------------------------+
    | SUBSTRING_INDEX('foo#bar#baz#qux', '#', -1) |
    +---------------------------------------------+
    | qux                                         |
    +---------------------------------------------+
    1 row in set (0.00 sec)
    
  • Return everything before the third separator in a string:

    mysql> SELECT SUBSTRING_INDEX('foo#bar#baz#qux', '#', 3);
    +--------------------------------------------+
    | SUBSTRING_INDEX('foo#bar#baz#qux', '#', 3) |
    +--------------------------------------------+
    | foo#bar#baz                                |
    +--------------------------------------------+
    1 row in set (0.00 sec)
    
  • Return the second item in a string, by chaining two calls:

    mysql> SELECT SUBSTRING_INDEX(SUBSTRING_INDEX('foo#bar#baz#qux', '#', 2), '#', -1);
    +----------------------------------------------------------------------+
    | SUBSTRING_INDEX(SUBSTRING_INDEX('foo#bar#baz#qux', '#', 2), '#', -1) |
    +----------------------------------------------------------------------+
    | bar                                                                  |
    +----------------------------------------------------------------------+
    1 row in set (0.00 sec)
    

In general, a simple way to get the nth element of a #-separated string (assuming that you know it definitely has at least n elements) is to do:

SUBSTRING_INDEX(SUBSTRING_INDEX(your_string, '#', n), '#', -1);

The inner SUBSTRING_INDEX call discards the nth separator and everything after it, and then the outer SUBSTRING_INDEX call discards everything except the final element that remains.
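
Applied to the data in the question, the two nested calls can be traced as below. This is a small sketch using literal values rather than the real tables; the only addition is the + 1, which is needed because the question's courseId is a 0-based index while SUBSTRING_INDEX counts items from 1.

```sql
-- courseId = 2 (0-based), so ask for element 2 + 1 = 3.
-- Inner call keeps 'AB01,AB02,AB03' (there is no third comma to cut at);
-- outer call keeps only what follows the last comma: 'AB03'.
SELECT SUBSTRING_INDEX(SUBSTRING_INDEX('AB01,AB02,AB03', ',', 2 + 1), ',', -1) AS courseName;

-- courseId = 1, so ask for element 1 + 1 = 2.
-- Inner call yields 'AB01,AB02'; outer call yields 'AB02'.
SELECT SUBSTRING_INDEX(SUBSTRING_INDEX('AB01,AB02,AB03', ',', 1 + 1), ',', -1) AS courseName;
```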

If you want a more robust solution that returns NULL if you ask for an element that doesn't exist (for instance, asking for the 5th element of 'a#b#c#d'), then you can count the delimiters using REPLACE and then conditionally return NULL using IF():

IF(
    (LENGTH(your_string) - LENGTH(REPLACE(your_string, '#', ''))) / LENGTH('#') < n - 1,
    NULL,
    SUBSTRING_INDEX(SUBSTRING_INDEX(your_string, '#', n), '#', -1)
)
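
To see how the delimiter count works, the expression can be evaluated on the example string from the text. This is just a worked illustration with literal values:

```sql
-- 'a#b#c#d' has length 7; with the '#' characters removed it has length 4.
-- (7 - 4) / 1 gives 3 delimiters, so the string holds 4 elements.
SELECT (LENGTH('a#b#c#d') - LENGTH(REPLACE('a#b#c#d', '#', ''))) / LENGTH('#') AS delimiter_count;

-- Asking for element n = 5: the test 3 < 5 - 1 is true, so IF() yields NULL
-- instead of repeating the last element.
SELECT IF(
    (LENGTH('a#b#c#d') - LENGTH(REPLACE('a#b#c#d', '#', ''))) / LENGTH('#') < 5 - 1,
    NULL,
    SUBSTRING_INDEX(SUBSTRING_INDEX('a#b#c#d', '#', 5), '#', -1)
) AS fifth_element;
```

Dividing by LENGTH of the delimiter is what keeps the count correct for multi-character delimiters such as '###'.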

Of course, this is pretty ugly and hard to understand! So you might want to wrap it in a function:

CREATE FUNCTION split(string TEXT, delimiter TEXT, n INT)
RETURNS TEXT DETERMINISTIC
RETURN IF(
    (LENGTH(string) - LENGTH(REPLACE(string, delimiter, ''))) / LENGTH(delimiter) < n - 1,
    NULL,
    SUBSTRING_INDEX(SUBSTRING_INDEX(string, delimiter, n), delimiter, -1)
);

You can then use the function like this:

mysql> SELECT SPLIT('foo,bar,baz,qux', ',', 3);
+----------------------------------+
| SPLIT('foo,bar,baz,qux', ',', 3) |
+----------------------------------+
| baz                              |
+----------------------------------+
1 row in set (0.00 sec)

mysql> SELECT SPLIT('foo,bar,baz,qux', ',', 5);
+----------------------------------+
| SPLIT('foo,bar,baz,qux', ',', 5) |
+----------------------------------+
| NULL                             |
+----------------------------------+
1 row in set (0.00 sec)

mysql> SELECT SPLIT('foo###bar###baz###qux', '###', 2);
+------------------------------------------+
| SPLIT('foo###bar###baz###qux', '###', 2) |
+------------------------------------------+
| bar                                      |
+------------------------------------------+
1 row in set (0.00 sec)

Answer by DarkSide

Based on Alex Stevenson's answer (https://stackoverflow.com/a/11022431/1466341), I came up with an even better solution: one that does not depend on one exact record ID.

Assuming that the comma-separated list is in the column data.list, and that it contains a list of codes from another table's column, classification.code, you can do something like:

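SELECT
    d.id, d.list, c.code
FROM
    classification c
    JOIN data d
        -- [[:<:]] and [[:>:]] mark word boundaries, so each code matches only as a whole item.
        -- MySQL 8.0.4 and later (ICU regex library) do not support these markers; use '\\b' there.
        ON d.list REGEXP CONCAT('[[:<:]]', c.code, '[[:>:]]');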

So if you have tables and data like this:

CLASSIFICATION (code varchar(4) unique): ('A'), ('B'), ('C'), ('D')
DATA (id int, list varchar(255)): (100, 'C,A,B'), (150, 'B,A,D'), (200,'B')

the above SELECT will return:

(100, 'C,A,B', 'A'),
(100, 'C,A,B', 'B'),
(100, 'C,A,B', 'C'),
(150, 'B,A,D', 'A'),
(150, 'B,A,D', 'B'),
(150, 'B,A,D', 'D'),
(200, 'B', 'B'),

Answer by Alwin Kesler

I've resolved this kind of problem with a regular expression pattern. Regular expressions tend to be slower than regular queries, but this is an easy way to retrieve data from a comma-delimited column.

SELECT * 
FROM `TABLE`
WHERE `field` REGEXP ',?[SEARCHED-VALUE],?';

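The optional commas (,?) let the pattern match at the beginning or the end of the string. Note that because both commas are optional, the pattern also matches the searched value when it is only part of a longer item; anchor it as (^|,)value(,|$) if only whole items should match.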

Hope that helps anyone in the future.
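
As a point of comparison, and not part of the answer above, MySQL's built-in FIND_IN_SET() matches whole items of a comma-separated list without a regular expression. The sketch below contrasts it with the optional-comma pattern and with an anchored pattern, using literal values; `TABLE` and `field` in the last query are the placeholder names from the answer.

```sql
SELECT FIND_IN_SET('AB02', 'AB01,AB02,AB03');      -- 2 (1-based position of the item)
SELECT FIND_IN_SET('AB0',  'AB01,AB02,AB03');      -- 0 (no whole item equals 'AB0')

SELECT 'AB01,AB02,AB03' REGEXP ',?AB0,?';          -- 1: matches inside 'AB01'
SELECT 'AB01,AB02,AB03' REGEXP '(^|,)AB0(,|$)';    -- 0: anchored to item boundaries
SELECT 'AB01,AB02,AB03' REGEXP '(^|,)AB02(,|$)';   -- 1

-- Filtering rows on a comma-delimited column:
SELECT *
FROM `TABLE`
WHERE FIND_IN_SET('AB02', `field`) > 0;
```

FIND_IN_SET() only works with a comma as the separator and, like the regular expression, cannot use an ordinary index on the column.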

Answer by Alex Stevenson

Building on Alwin Kesler's solution, here's a somewhat more practical, real-world example.

Assuming that the comma-separated list is in my_table.list, and that it is a list of IDs from my_other_table.id, you can do something like:

SELECT 
    * 
FROM 
    my_other_table 
WHERE 
    (SELECT list FROM my_table WHERE id = '1234') REGEXP CONCAT(',?', my_other_table.id, ',?');

Answer by Kickstart

It is possible to explode a string in a MySQL SELECT statement.

First, generate a series of numbers up to the largest number of delimited values you wish to explode, either from a table of integers or by unioning numbers together. The following generates 100 rows giving the values 1 to 100. It can easily be expanded to give larger ranges (add another subquery giving the values 0 to 9 for the hundreds, hence giving 0 to 999, etc.).

SELECT 1 + units.i + tens.i * 10 AS aNum
FROM (SELECT 0 AS i UNION SELECT 1 UNION SELECT 2 UNION SELECT 3 UNION SELECT 4 UNION SELECT 5 UNION SELECT 6 UNION SELECT 7 UNION SELECT 8 UNION SELECT 9) units
CROSS JOIN (SELECT 0 AS i UNION SELECT 1 UNION SELECT 2 UNION SELECT 3 UNION SELECT 4 UNION SELECT 5 UNION SELECT 6 UNION SELECT 7 UNION SELECT 8 UNION SELECT 9) tens
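
On newer servers the same series can be produced differently. The following is a sketch of an alternative, not the technique used in this answer, and it assumes MySQL 8.0 or later, where recursive common table expressions are available; the name seq is arbitrary.

```sql
-- Assumes MySQL 8.0+ (recursive CTEs). Generates the values 1 to 100.
WITH RECURSIVE seq (aNum) AS (
    SELECT 1
    UNION ALL
    SELECT aNum + 1 FROM seq WHERE aNum < 100
)
SELECT aNum FROM seq;
```

For ranges beyond the server's cte_max_recursion_depth setting (1000 by default), that limit would need to be raised or the unioned-digits approach used instead.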

This can be cross joined against your table to give you the values. Note that the inner SUBSTRING_INDEX returns everything up to the Nth delimited value, and the outer SUBSTRING_INDEX then keeps only that Nth value, discarding the ones before it.

SELECT SUBSTRING_INDEX(SUBSTRING_INDEX(clients.courseNames, ',', sub0.aNum), ',', -1) AS a_course_name
FROM clients
CROSS JOIN
(
    SELECT 1 + units.i + tens.i * 10 AS aNum, units.i + tens.i * 10 AS aSubscript
    FROM (SELECT 0 AS i UNION SELECT 1 UNION SELECT 2 UNION SELECT 3 UNION SELECT 4 UNION SELECT 5 UNION SELECT 6 UNION SELECT 7 UNION SELECT 8 UNION SELECT 9) units
    CROSS JOIN (SELECT 0 AS i UNION SELECT 1 UNION SELECT 2 UNION SELECT 3 UNION SELECT 4 UNION SELECT 5 UNION SELECT 6 UNION SELECT 7 UNION SELECT 8 UNION SELECT 9) tens
) sub0

As you can see, there is a slight issue here: the last delimited value is repeated many times. To get rid of this you need to limit the range of numbers based on how many delimiters there are. This can be done by taking the length of the delimited field and comparing it to the length of the delimited field with the delimiters changed to '' (to remove them). From this you can get the number of delimiters:

SELECT SUBSTRING_INDEX(SUBSTRING_INDEX(clients.courseNames, ',', sub0.aNum), ',', -1) AS a_course_name
FROM clients
INNER JOIN
(
    SELECT 1 + units.i + tens.i * 10 AS aNum
    FROM (SELECT 0 AS i UNION SELECT 1 UNION SELECT 2 UNION SELECT 3 UNION SELECT 4 UNION SELECT 5 UNION SELECT 6 UNION SELECT 7 UNION SELECT 8 UNION SELECT 9) units
    CROSS JOIN (SELECT 0 AS i UNION SELECT 1 UNION SELECT 2 UNION SELECT 3 UNION SELECT 4 UNION SELECT 5 UNION SELECT 6 UNION SELECT 7 UNION SELECT 8 UNION SELECT 9) tens
) sub0
ON (1 + LENGTH(clients.courseNames) - LENGTH(REPLACE(clients.courseNames, ',', ''))) >= sub0.aNum

In the original example you could (for example) count the number of students on each course based on this. Note that I have changed the subquery that generates the range of numbers to return two numbers: one is used to determine the course name (as these positions start at 1) and the other is the subscript (as courseId values start at 0).

SELECT SUBSTRING_INDEX(SUBSTRING_INDEX(clients.courseNames, ',', sub0.aNum), ',', -1) AS a_course_name, COUNT(clientenrols.studentId)
FROM clients
INNER JOIN
(
    SELECT 1 + units.i + tens.i * 10 AS aNum, units.i + tens.i * 10 AS aSubscript
    FROM (SELECT 0 AS i UNION SELECT 1 UNION SELECT 2 UNION SELECT 3 UNION SELECT 4 UNION SELECT 5 UNION SELECT 6 UNION SELECT 7 UNION SELECT 8 UNION SELECT 9) units
    CROSS JOIN (SELECT 0 AS i UNION SELECT 1 UNION SELECT 2 UNION SELECT 3 UNION SELECT 4 UNION SELECT 5 UNION SELECT 6 UNION SELECT 7 UNION SELECT 8 UNION SELECT 9) tens
) sub0
ON (1 + LENGTH(clients.courseNames) - LENGTH(REPLACE(clients.courseNames, ',', ''))) >= sub0.aNum
LEFT OUTER JOIN clientenrols
ON clientenrols.courseId = sub0.aSubscript
GROUP BY a_course_name

As you can see, it is possible but quite messy, and with little opportunity to use indexes it is not going to be efficient. Furthermore, the range of numbers must cover the greatest number of delimited values, and the query works by excluding lots of duplicates; if the maximum number of delimited values is very large, this will slow things down dramatically. Overall it is generally far better to just properly normalise the database.

Answer by user1894169

If you need to get a table (one row per value) from a delimited string:

SET @str = 'function1;function2;function3;function4;aaa;bbbb;nnnnn';
SET @delimeter = ';';
SET @sql_statement = CONCAT('SELECT '''
                ,REPLACE(@str, @delimeter, ''' UNION ALL SELECT ''')
                ,'''');
SELECT @sql_statement;
-- which yields the following statement:
SELECT 'function1' UNION ALL SELECT 'function2' UNION ALL SELECT 'function3' UNION ALL SELECT 'function4' UNION ALL SELECT 'aaa' UNION ALL SELECT 'bbbb' UNION ALL SELECT 'nnnnn'
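
The generated text is only a string until it is executed. One way to run it, sketched here with a shorter input and an arbitrary statement name (stmt), is a prepared statement:

```sql
SET @str = 'function1;function2;function3';
SET @delimeter = ';';
SET @sql_statement = CONCAT('SELECT ''',
                            REPLACE(@str, @delimeter, ''' UNION ALL SELECT '''),
                            '''');

PREPARE stmt FROM @sql_statement;
EXECUTE stmt;                -- returns one row per delimited value
DEALLOCATE PREPARE stmt;
```

Because the values are pasted into SQL text, an input containing a single quote would break the statement or alter it, so this is only suitable for trusted, sanitised strings.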

Answer by suraj deep

SELECT
  tab1.std_name, tab1.stdCode, tab1.payment,
  SUBSTRING_INDEX(tab1.payment, '|', 1) as rupees,
  SUBSTRING(tab1.payment, LENGTH(SUBSTRING_INDEX(tab1.payment, '|', 1)) + 2,LENGTH(SUBSTRING_INDEX(tab1.payment, '|', 2))) as date
FROM (
  SELECT DISTINCT
    si.std_name, hfc.stdCode,
    if(isnull(hfc.payDate), concat(hfc.coutionMoneyIn,'|', year(hfc.startDtae), '-',  monthname(hfc.startDtae)), concat(hfc.payMoney, '|', monthname(hfc.payDate), '-', year(hfc.payDate))) AS payment
  FROM hostelfeescollection hfc
  INNER JOIN hostelfeecollectmode hfm ON hfc.tranId = hfm.tranId
  INNER JOIN student_info_1 si ON si.std_code = hfc.stdCode
  WHERE hfc.tranId = 'TRAN-AZZZY69454'
) AS tab1