如何创建 MySQL 分层递归查询

声明:本页面是StackOverFlow热门问题的中英对照翻译,遵循CC BY-SA 4.0协议,如果您需要使用它,必须同样遵循CC BY-SA许可,注明原文地址和作者信息,同时你必须将它归于原作者(不是我):StackOverFlow 原文地址: http://stackoverflow.com/questions/20215744/
Warning: these are provided under cc-by-sa 4.0 license. You are free to use/share it, But you must attribute it to the original authors (not me): StackOverFlow

提示:将鼠标放在中文语句上可以显示对应的英文。显示中英文
时间:2020-08-31 19:32:46  来源:igfitidea点击:

How to create a MySQL hierarchical recursive query

mysqlsqlhierarchical-datarecursive-query

提问by Tarun Parswani

I have a MySQL table which is as follows:

我有一个 MySQL 表,如下所示:

id | name        | parent_id
19 | category1   | 0
20 | category2   | 19
21 | category3   | 20
22 | category4   | 21
......

Now, I want to have a single MySQL query to which I simply supply the id [for instance say 'id = 19'] then I should get all its child ids [i.e. result should have ids '20,21,22'].... Also, the hierarchy of the children is not known it can vary....

现在,我想要一个单一的 MySQL 查询,我只需提供 id [例如说 'id = 19'] 然后我应该得到它的所有子 ID [即结果应该有 ids '20,21,22']。 ......此外,孩子的层次结构是未知的,它可能会有所不同......

Also, I already have the solution using the for loop..... Let me know how to achieve the same using a single MySQL query if possible.

另外,我已经有了使用 for 循环的解决方案..... 如果可能,请告诉我如何使用单个 MySQL 查询实现相同的目标。

回答by trincot

For MySQL 8+:use the recursive withsyntax.
For MySQL 5.x:use inline variables, path IDs, or self-joins.

对于MySQL 8+:使用递归with语法。
对于MySQL 5.x:使用内联变量、路径 ID 或自联接。

MySQL 8+

MySQL 8+

with recursive cte (id, name, parent_id) as (
  select     id,
             name,
             parent_id
  from       products
  where      parent_id = 19
  union all
  select     p.id,
             p.name,
             p.parent_id
  from       products p
  inner join cte
          on p.parent_id = cte.id
)
select * from cte;

The value specified in parent_id = 19should be set to the idof the parent you want to select all the descendants of.

中指定的值parent_id = 19应设置为id要选择其所有后代的父代。

MySQL 5.x

MySQL 5.x

For MySQL versions that do not support Common Table Expressions (up to version 5.7), you would achieve this with the following query:

对于不支持公用表表达式(最高版本 5.7)的 MySQL 版本,您可以使用以下查询来实现:

select  id,
        name,
        parent_id 
from    (select * from products
         order by parent_id, id) products_sorted,
        (select @pv := '19') initialisation
where   find_in_set(parent_id, @pv)
and     length(@pv := concat(@pv, ',', id))

Here is a fiddle.

这是一个小提琴

Here, the value specified in @pv := '19'should be set to the idof the parent you want to select all the descendants of.

此处,指定的值@pv := '19'应设置为id要选择其所有后代的父代的 。

This will work also if a parent has multiplechildren. However, it is required that each record fulfills the condition parent_id < id, otherwise the results will not be complete.

如果父母有多个孩子,这也将起作用。但是,要求每条记录都满足条件parent_id < id,否则结果将不完整。

Variable assignments inside a query

查询中的变量赋值

This query uses specific MySQL syntax: variables are assigned and modified during its execution. Some assumptions are made about the order of execution:

此查询使用特定的 MySQL 语法:在执行期间分配和修改变量。对执行顺序做了一些假设:

  • The fromclause is evaluated first. So that is where @pvgets initialised.
  • The whereclause is evaluated for each record in the order of retrieval from the fromaliases. So this is where a condition is put to only include records for which the parent was already identified as being in the descendant tree (all descendants of the primary parent are progressively added to @pv).
  • The conditions in this whereclause are evaluated in order, and the evaluation is interrupted once the total outcome is certain. Therefore the second condition must be in second place, as it adds the idto the parent list, and this should only happen if the idpasses the first condition. The lengthfunction is only called to make sure this condition is always true, even if the pvstring would for some reason yield a falsy value.
  • from子句第一评价。所以这就是@pv初始化的地方。
  • where按照从from别名中检索的顺序为每条记录评估该子句。因此,这是放置条件以仅包括父代已被识别为在后代树中的记录(主要父代的所有后代都逐渐添加到@pv)。
  • 本条中的条件按where顺序进行评估,一旦总结果确定,则中断评估。因此,第二个条件必须排在第二位,因为它将添加id到父列表中,并且只有在id通过第一个条件时才会发生这种情况。length调用该函数只是为了确保此条件始终为真,即使pv字符串由于某种原因会产生假值。

All in all, one may find these assumptions too risky to rely on. The documentationwarns:

总而言之,人们可能会发现这些假设风险太大而无法依赖。该文件警告说:

you might get the results you expect, but this is not guaranteed [...] the order of evaluation for expressions involving user variables is undefined.

您可能会得到您期望的结果,但这并不能保证[...] 涉及用户变量的表达式的求值顺序未定义。

So even though it works consistently with the above query, the evaluation order may still change, for instance when you add conditions or use this query as a view or sub-query in a larger query. It is a "feature" that will be removed in a future MySQL release:

因此,即使它与上述查询一致,评估顺序仍可能发生变化,例如,当您添加条件或将此查询用作更大查询中的视图或子查询时。这是一个“功能”,将在未来的 MySQL 版本中删除

Previous releases of MySQL made it possible to assign a value to a user variable in statements other than SET. This functionality is supported in MySQL 8.0 for backward compatibility but is subject to removal in a future release of MySQL.

以前的 MySQL 版本可以在除SET. MySQL 8.0 支持此功能以实现向后兼容性,但在未来的 MySQL 版本中可能会被删除。

As stated above, from MySQL 8.0 onward you should use the recursive withsyntax.

如上所述,从 MySQL 8.0 开始,您应该使用递归with语法。

Efficiency

效率

For very large data sets this solution might get slow, as the find_in_setoperation is not the most ideal way to find a number in a list, certainly not in a list that reaches a size in the same order of magnitude as the number of records returned.

对于非常大的数据集,此解决方案可能会变慢,因为该find_in_set操作不是在列表中查找数字的最理想方法,当然不是在大小与返回的记录数量相同数量级的列表中查找。

Alternative 1: with recursive, connect by

选择1: with recursiveconnect by

More and more databases implement the SQL:1999 ISO standard WITH [RECURSIVE]syntaxfor recursive queries (e.g. Postgres 8.4+, SQL Server 2005+, DB2, Oracle 11gR2+, SQLite 3.8.4+, Firebird 2.1+, H2, HyperSQL 2.1.0+, Teradata, MariaDB 10.2.2+). And as of version 8.0, also MySQL supports it. See the top of this answer for the syntax to use.

越来越多的数据库执行SQL:1999 ISO标准WITH [RECURSIVE]语法的递归查询(如Postgres的8.4+SQL Server的2005+DB2甲骨文11gR2的+SQLite的3.8.4+火鸟2.1+H2的HyperSQL 2.1.0+Teradata的, MariaDB 10.2.2+)。从8.0 版开始,MySQL 也支持它。有关要使用的语法,请参阅此答案的顶部。

Some databases have an alternative, non-standard syntax for hierarchical look-ups, such as the CONNECT BYclause available on Oracle, DB2, Informix, CUBRIDand other databases.

某些数据库具有用于分层查找的替代性非标准语法,例如OracleDB2InformixCUBRID和其他数据库CONNECT BY上可用的子句。

MySQL version 5.7 does not offer such a feature. When your database engine provides this syntax or you can migrate to one that does, then that is certainly the best option to go for. If not, then also consider the following alternatives.

MySQL 5.7 版没有提供这样的功能。如果您的数据库引擎提供了这种语法,或者您可以迁移到提供这种语法的引擎,那么这无疑是最好的选择。如果不是,那么还可以考虑以下替代方案。

Alternative 2: Path-style Identifiers

备选方案 2:路径式标识符

Things become a lot easier if you would assign idvalues that contain the hierarchical information: a path. For example, in your case this could look like this:

如果您分配id包含分层信息的值:路径,事情会变得容易得多。例如,在您的情况下,这可能如下所示:

ID       | NAME
19       | category1   
19/1     | category2  
19/1/1   | category3  
19/1/1/1 | category4  

Then your selectwould look like this:

然后你select看起来像这样:

select  id,
        name 
from    products
where   id like '19/%'

Alternative 3: Repeated Self-joins

备选方案 3:重复自联接

If you know an upper limit for how deep your hierarchy tree can become, you can use a standard sqlquery like this:

如果您知道层次结构树的深度上限,则可以使用如下标准sql查询:

select      p6.parent_id as parent6_id,
            p5.parent_id as parent5_id,
            p4.parent_id as parent4_id,
            p3.parent_id as parent3_id,
            p2.parent_id as parent2_id,
            p1.parent_id as parent_id,
            p1.id as product_id,
            p1.name
from        products p1
left join   products p2 on p2.id = p1.parent_id 
left join   products p3 on p3.id = p2.parent_id 
left join   products p4 on p4.id = p3.parent_id  
left join   products p5 on p5.id = p4.parent_id  
left join   products p6 on p6.id = p5.parent_id
where       19 in (p1.parent_id, 
                   p2.parent_id, 
                   p3.parent_id, 
                   p4.parent_id, 
                   p5.parent_id, 
                   p6.parent_id) 
order       by 1, 2, 3, 4, 5, 6, 7;

See this fiddle

看到这个小提琴

The wherecondition specifies which parent you want to retrieve the descendants of. You can extend this query with more levels as needed.

where条件指定要检索其后代的父级。您可以根据需要使用更多级别扩展此查询。

回答by Damodaran

From the blog Managing Hierarchical Data in MySQL

来自博客在 MySQL 中管理分层数据

Table structure

表结构

+-------------+----------------------+--------+
| category_id | name                 | parent |
+-------------+----------------------+--------+
|           1 | ELECTRONICS          |   NULL |
|           2 | TELEVISIONS          |      1 |
|           3 | TUBE                 |      2 |
|           4 | LCD                  |      2 |
|           5 | PLASMA               |      2 |
|           6 | PORTABLE ELECTRONICS |      1 |
|           7 | MP3 PLAYERS          |      6 |
|           8 | FLASH                |      7 |
|           9 | CD PLAYERS           |      6 |
|          10 | 2 WAY RADIOS         |      6 |
+-------------+----------------------+--------+

Query:

询问:

SELECT t1.name AS lev1, t2.name as lev2, t3.name as lev3, t4.name as lev4
FROM category AS t1
LEFT JOIN category AS t2 ON t2.parent = t1.category_id
LEFT JOIN category AS t3 ON t3.parent = t2.category_id
LEFT JOIN category AS t4 ON t4.parent = t3.category_id
WHERE t1.name = 'ELECTRONICS';

Output

输出

+-------------+----------------------+--------------+-------+
| lev1        | lev2                 | lev3         | lev4  |
+-------------+----------------------+--------------+-------+
| ELECTRONICS | TELEVISIONS          | TUBE         | NULL  |
| ELECTRONICS | TELEVISIONS          | LCD          | NULL  |
| ELECTRONICS | TELEVISIONS          | PLASMA       | NULL  |
| ELECTRONICS | PORTABLE ELECTRONICS | MP3 PLAYERS  | FLASH |
| ELECTRONICS | PORTABLE ELECTRONICS | CD PLAYERS   | NULL  |
| ELECTRONICS | PORTABLE ELECTRONICS | 2 WAY RADIOS | NULL  |
+-------------+----------------------+--------------+-------+

Most users at one time or another have dealt with hierarchical data in a SQL database and no doubt learned that the management of hierarchical data is not what a relational database is intended for. The tables of a relational database are not hierarchical (like XML), but are simply a flat list. Hierarchical data has a parent-child relationship that is not naturally represented in a relational database table. Read more

大多数用户曾经或多次处理过 SQL 数据库中的分层数据,并且毫无疑问地了解到分层数据的管理不是关系数据库的目的。关系数据库的表不是分层的(如 XML),而只是一个平面列表。分层数据具有父子关系,这种关系在关系数据库表中并不自然表示。 阅读更多

Refer the blog for more details.

有关更多详细信息,请参阅博客。

EDIT:

编辑:

select @pv:=category_id as category_id, name, parent from category
join
(select @pv:=19)tmp
where parent=@pv

Output:

输出:

category_id name    parent
19  category1   0
20  category2   19
21  category3   20
22  category4   21

Reference: How to do the Recursive SELECT query in Mysql?

参考:如何在Mysql中进行递归SELECT查询?

回答by Dheerendra Kulkarni

Did the same thing for another quetion here

在这里为另一个问题做了同样的事情

Mysql select recursive get all child with multiple level

Mysql选择递归获取具有多个级别的所有子项

The query will be :

查询将是:

SELECT GROUP_CONCAT(lv SEPARATOR ',') FROM (
  SELECT @pv:=(
    SELECT GROUP_CONCAT(id SEPARATOR ',')
    FROM table WHERE parent_id IN (@pv)
  ) AS lv FROM table 
  JOIN
  (SELECT @pv:=1)tmp
  WHERE parent_id IN (@pv)
) a;

回答by Fandi Susanto

Try these:

试试这些:

Table definition:

表定义:

DROP TABLE IF EXISTS category;
CREATE TABLE category (
    id INT AUTO_INCREMENT PRIMARY KEY,
    name VARCHAR(20),
    parent_id INT,
    CONSTRAINT fk_category_parent FOREIGN KEY (parent_id)
    REFERENCES category (id)
) engine=innodb;

Experimental rows:

实验行:

INSERT INTO category VALUES
(19, 'category1', NULL),
(20, 'category2', 19),
(21, 'category3', 20),
(22, 'category4', 21),
(23, 'categoryA', 19),
(24, 'categoryB', 23),
(25, 'categoryC', 23),
(26, 'categoryD', 24);

Recursive Stored procedure:

递归存储过程:

DROP PROCEDURE IF EXISTS getpath;
DELIMITER $$
CREATE PROCEDURE getpath(IN cat_id INT, OUT path TEXT)
BEGIN
    DECLARE catname VARCHAR(20);
    DECLARE temppath TEXT;
    DECLARE tempparent INT;
    SET max_sp_recursion_depth = 255;
    SELECT name, parent_id FROM category WHERE id=cat_id INTO catname, tempparent;
    IF tempparent IS NULL
    THEN
        SET path = catname;
    ELSE
        CALL getpath(tempparent, temppath);
        SET path = CONCAT(temppath, '/', catname);
    END IF;
END$$
DELIMITER ;

Wrapper function for the stored procedure:

存储过程的包装函数:

DROP FUNCTION IF EXISTS getpath;
DELIMITER $$
CREATE FUNCTION getpath(cat_id INT) RETURNS TEXT DETERMINISTIC
BEGIN
    DECLARE res TEXT;
    CALL getpath(cat_id, res);
    RETURN res;
END$$
DELIMITER ;

Select example:

选择示例:

SELECT id, name, getpath(id) AS path FROM category;

Output:

输出:

+----+-----------+-----------------------------------------+
| id | name      | path                                    |
+----+-----------+-----------------------------------------+
| 19 | category1 | category1                               |
| 20 | category2 | category1/category2                     |
| 21 | category3 | category1/category2/category3           |
| 22 | category4 | category1/category2/category3/category4 |
| 23 | categoryA | category1/categoryA                     |
| 24 | categoryB | category1/categoryA/categoryB           |
| 25 | categoryC | category1/categoryA/categoryC           |
| 26 | categoryD | category1/categoryA/categoryB/categoryD |
+----+-----------+-----------------------------------------+

Filtering rows with certain path:

过滤具有特定路径的行:

SELECT id, name, getpath(id) AS path FROM category HAVING path LIKE 'category1/category2%';

Output:

输出:

+----+-----------+-----------------------------------------+
| id | name      | path                                    |
+----+-----------+-----------------------------------------+
| 20 | category2 | category1/category2                     |
| 21 | category3 | category1/category2/category3           |
| 22 | category4 | category1/category2/category3/category4 |
+----+-----------+-----------------------------------------+

回答by Der Zinger

The best approach I've come up with is

我想出的最好方法是

  1. Use lineage to store\sort\trace trees. That's more than enough, and works thousands times faster for reading than any other approach. It also allows to stay on that pattern even if DB will change(as ANY db will allow that pattern to be used)
  2. Use function that determines lineage for specific ID.
  3. Use it as you wish (in selects, or on CUD operations, or even by jobs).
  1. 使用谱系来存储\排序\跟踪树。这已经绰绰有余,阅读速度比任何其他方法快数千倍。即使 DB 发生变化,它也允许保持该模式(因为任何 db 都允许使用该模式)
  2. 使用确定特定 ID 的谱系的函数。
  3. 随心所欲地使用它(在选择中,或在 CUD 操作中,甚至在作业中)。

Lineage approach descr. can be found wherever, for example Hereor here. As of function - thatis what enspired me.

血统方法描述 可以在任何地方找到,例如 Herehere。至于功能 -就是激励我的原因。

In the end - got more-or-less simple, relatively fast, and SIMPLE solution.

最后 - 得到了或多或少简单、相对快速且简单的解决方案。

Function's body

函数体

-- --------------------------------------------------------------------------------
-- Routine DDL
-- Note: comments before and after the routine body will not be stored by the server
-- --------------------------------------------------------------------------------
DELIMITER $$

CREATE DEFINER=`root`@`localhost` FUNCTION `get_lineage`(the_id INT) RETURNS text CHARSET utf8
    READS SQL DATA
BEGIN

 DECLARE v_rec INT DEFAULT 0;

 DECLARE done INT DEFAULT FALSE;
 DECLARE v_res text DEFAULT '';
 DECLARE v_papa int;
 DECLARE v_papa_papa int DEFAULT -1;
 DECLARE csr CURSOR FOR 
  select _id,parent_id -- @n:=@n+1 as rownum,T1.* 
  from 
    (SELECT @r AS _id,
        (SELECT @r := table_parent_id FROM table WHERE table_id = _id) AS parent_id,
        @l := @l + 1 AS lvl
    FROM
        (SELECT @r := the_id, @l := 0,@n:=0) vars,
        table m
    WHERE @r <> 0
    ) T1
    where T1.parent_id is not null
 ORDER BY T1.lvl DESC;
 DECLARE CONTINUE HANDLER FOR NOT FOUND SET done = TRUE;
    open csr;
    read_loop: LOOP
    fetch csr into v_papa,v_papa_papa;
        SET v_rec = v_rec+1;
        IF done THEN
            LEAVE read_loop;
        END IF;
        -- add first
        IF v_rec = 1 THEN
            SET v_res = v_papa_papa;
        END IF;
        SET v_res = CONCAT(v_res,'-',v_papa);
    END LOOP;
    close csr;
    return v_res;
END

And then you just

然后你就

select get_lineage(the_id)

Hope it helps somebody :)

希望它可以帮助某人:)

回答by Justin Howard

If you need quick read speed, the best option is to use a closure table. A closure table contains a row for each ancestor/descendant pair. So in your example, the closure table would look like

如果您需要快速读取速度,最好的选择是使用闭表。闭包表为每个祖先/后代对包含一行。所以在你的例子中,闭包表看起来像

ancestor | descendant | depth
0        | 0          | 0
0        | 19         | 1
0        | 20         | 2
0        | 21         | 3
0        | 22         | 4
19       | 19         | 0
19       | 20         | 1
19       | 21         | 3
19       | 22         | 4
20       | 20         | 0
20       | 21         | 1
20       | 22         | 2
21       | 21         | 0
21       | 22         | 1
22       | 22         | 0

Once you have this table, hierarchical queries become very easy and fast. To get all the descendants of category 20:

一旦你有了这个表,分层查询就变得非常容易和快速。要获取类别 20 的所有后代:

SELECT cat.* FROM categories_closure AS cl
INNER JOIN categories AS cat ON cat.id = cl.descendant
WHERE cl.ancestor = 20 AND cl.depth > 0

Of course, there is a big downside whenever you use denormalized data like this. You need to maintain the closure table alongside your categories table. The best way is probably to use triggers, but it is somewhat complex to correctly track inserts/updates/deletes for closure tables. As with anything, you need to look at your requirements and decide what approach is best for you.

当然,每当您使用这样的非规范化数据时,都会有很大的缺点。您需要在类别表旁边维护闭包表。最好的方法可能是使用触发器,但正确跟踪闭包表的插入/更新/删除有点复杂。与任何事情一样,您需要查看您的要求并决定哪种方法最适合您。

Edit: See the question What are the options for storing hierarchical data in a relational database?for more options. There are different optimal solutions for different situations.

编辑:请参阅问题在关系数据库中存储分层数据有哪些选项?更多选择。不同的情况有不同的最优解。

回答by lynx_74

Simple query to list child's of first recursion:

列出第一次递归的子级的简单查询:

select @pv:=id as id, name, parent_id
from products
join (select @pv:=19)tmp
where parent_id=@pv

Result:

结果:

id  name        parent_id
20  category2   19
21  category3   20
22  category4   21
26  category24  22

... with left join:

...与左连接:

select
    @pv:=p1.id as id
  , p2.name as parent_name
  , p1.name name
  , p1.parent_id
from products p1
join (select @pv:=19)tmp
left join products p2 on p2.id=p1.parent_id -- optional join to get parent name
where p1.parent_id=@pv

The solution of @tincot to list all child's:

@tincot 列出所有孩子的解决方案:

select  id,
        name,
        parent_id 
from    (select * from products
         order by parent_id, id) products_sorted,
        (select @pv := '19') initialisation
where   find_in_set(parent_id, @pv) > 0
and     @pv := concat(@pv, ',', id)

Test it online with Sql Fiddleand see all results.

使用Sql Fiddle在线测试并查看所有结果。

http://sqlfiddle.com/#!9/a318e3/4/0

http://sqlfiddle.com/#!9/a318e3/4/0

回答by Phil John

You can do it like this in other databases quite easily with a recursive query (YMMV on performance).

您可以使用递归查询(性能方面的 YMMV)轻松地在其他数据库中执行此操作。

The other way to do it is to store two extra bits of data, a left and right value. The left and right value are derived from a pre-order traversal of the tree structure you're representing.

另一种方法是存储两个额外的数据位,一个左值和右值。左值和右值来自您所表示的树结构的前序遍历。

This is know as Modified Preorder Tree Traversal and lets you run a simple query to get all parent values at once. It also goes by the name "nested set".

这称为 Modified Preorder Tree Traversal,可让您运行一个简单的查询以一次获取所有父值。它也被称为“嵌套集”。

回答by Saleh Mosleh

Just use BlueM/treephp class for make tree of a self-relation table in mysql.

只需使用BlueM/treephp 类在 mysql 中制作自相关表的树。

Tree?and?Tree\Node?are PHP classes for handling data that is structured hierarchically using parent ID references. A typical example is a table in a relational database where each record's “parent” field references the primary key of another record. Of course, Tree cannot only use data originating from a database, but anything: you supply the data, and Tree uses it, regardless of where the data came from and how it was processed. read more

Tree? 和?Tree\Node? 是PHP 类,用于处理使用父ID 引用分层构建的数据。一个典型的例子是关系数据库中的一个表,其中每条记录的“父”字段引用另一条记录的主键。当然,Tree 不仅可以使用源自数据库的数据,还可以使用任何数据:您提供数据,Tree 使用它,而不管数据来自何处以及如何处理。阅读更多

Here is an example of using BlueM/tree:

下面是一个使用 BlueM/tree 的例子:

<?php 
require '/path/to/vendor/autoload.php'; $db = new PDO(...); // Set up your database connection 
$stm = $db->query('SELECT id, parent, title FROM tablename ORDER BY title'); 
$records = $stm->fetchAll(PDO::FETCH_ASSOC); 
$tree = new BlueM\Tree($records); 
...

回答by Pradip Rupareliya

enter image description here

enter image description here

It's a categorytable.

这是一个类别表。

SELECT  id,
        NAME,
        parent_category 
FROM    (SELECT * FROM category
         ORDER BY parent_category, id) products_sorted,
        (SELECT @pv := '2') initialisation
WHERE   FIND_IN_SET(parent_category, @pv) > 0
AND     @pv := CONCAT(@pv, ',', id)

Output::enter image description here

输出::enter image description here