What is the MySQL query equivalent of PHP strip_tags?

Disclaimer: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same license and attribute it to the original authors (not me). Original StackOverflow question: http://stackoverflow.com/questions/7654436/

Time: 2020-08-31 21:14:40  Source: igfitidea

mysql strip-tags

Asked by faq

I have a large database containing records with <a> tags in them, and I would like to remove those tags. Of course I could write a PHP script that selects every row, runs strip_tags on it and updates the database, but this takes a long time. So how can I do this with a simple (or complicated) MySQL query?

Accepted answer by duskwuff -inactive-

I don't believe there's any efficient way to do this in MySQL alone.

MySQL does have a REPLACE() function, but it can only replace constant strings, not patterns. You could possibly write a MySQL stored function to search for and replace tags, but at that point you're probably better off writing a PHP script to do the job. It might not be quite as fast to run, but it will probably be faster to write.

Answer by Boann

Here you go:

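DELIMITER //

CREATE FUNCTION `strip_tags`($str text) RETURNS text
DETERMINISTIC
BEGIN
    DECLARE $start, $end INT DEFAULT 1;
    LOOP
        -- find the next "<"; when none is left, the string is clean
        SET $start = LOCATE("<", $str, $start);
        IF (!$start) THEN RETURN $str; END IF;
        -- find the closing ">"; an unclosed "<" is removed on its own
        SET $end = LOCATE(">", $str, $start);
        IF (!$end) THEN SET $end = $start; END IF;
        -- cut the tag out by overwriting it with an empty string
        SET $str = INSERT($str, $start, $end - $start + 1, "");
    END LOOP;
END //

DELIMITER ;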

I made sure it removes mismatched opening brackets because they're dangerous, though it ignores any unpaired closing brackets because they're harmless.

mysql> select strip_tags('<span>hel<b>lo <a href="world">wo<>rld</a> <<x>again<.');
+----------------------------------------------------------------------+
| strip_tags('<span>hel<b>lo <a href="world">wo<>rld</a> <<x>again<.') |
+----------------------------------------------------------------------+
| hello world again.                                                   |
+----------------------------------------------------------------------+
1 row in set
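
To apply the function to stored rows rather than to a literal, something like the following could work. This is only a sketch: `articles` and `body` are hypothetical table and column names, and it assumes the `strip_tags` function above has already been created. The `LIKE '%<%'` filter skips rows that contain no `<` at all, and it also skips NULL values, which the function as written does not handle.

```sql
-- Hypothetical table/column names: articles, body.
-- Preview what would change before touching any data.
SELECT body AS before_strip, strip_tags(body) AS after_strip
FROM articles
WHERE body LIKE '%<%'
LIMIT 20;

-- Then rewrite only the rows that contain at least one '<'.
UPDATE articles
SET body = strip_tags(body)
WHERE body LIKE '%<%';
```

Running the SELECT first is a cheap way to spot rows where a literal `<` in the text (for example in "a < b") would be treated as the start of a tag.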

Answer by Marco Marsala

MySQL >= 5.5 provides XML functions to solve your issue:

SELECT ExtractValue(field, '//text()') FROM table;

Reference: https://dev.mysql.com/doc/refman/5.5/en/xml-functions.html
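
One caveat worth checking before relying on this: ExtractValue() parses its input as XML, and according to the MySQL manual it can return NULL (with a warning) when the fragment is not well-formed, which is common for real-world HTML such as an unclosed <br>. A rough way to see how many rows would be affected, assuming a hypothetical table `articles` with a text column `body`:

```sql
-- Hypothetical table/column names: articles, body.
-- Counts rows that have content but come back as NULL from ExtractValue(),
-- i.e. rows whose markup could not be parsed as XML.
SELECT COUNT(*) AS rows_not_parsed
FROM articles
WHERE body IS NOT NULL
  AND ExtractValue(body, '//text()') IS NULL;
```

If this count is not zero, an UPDATE based on ExtractValue() would wipe those rows' content, so one of the string-based functions in the other answers may be the safer choice for them.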

Answer by phenicie

I am passing this code on, seems very similar to the above. Worked for me, hope it helps.

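-- NOTE: the original post shows only the function body. The CREATE FUNCTION
-- header (function name and VARCHAR(4000) types) is an assumption added so
-- that the snippet can be run.
DELIMITER //

CREATE FUNCTION fnStripTags(Dirty VARCHAR(4000)) RETURNS VARCHAR(4000)
DETERMINISTIC
BEGIN
  DECLARE iStart, iEnd, iLength INT;

  -- keep going while there is a '<' followed somewhere by a '>'
  WHILE LOCATE('<', Dirty) > 0 AND LOCATE('>', Dirty, LOCATE('<', Dirty)) > 0
  DO
    SET iStart = LOCATE('<', Dirty), iEnd = LOCATE('>', Dirty, LOCATE('<', Dirty));
    SET iLength = (iEnd - iStart) + 1;
    IF iLength > 0 THEN
      -- overwrite the tag with an empty string
      SET Dirty = INSERT(Dirty, iStart, iLength, '');
    END IF;
  END WHILE;
  RETURN Dirty;
END //

DELIMITER ;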

Answer by Scott2B

Boann's function works once I added SET $str = COALESCE($str, '');

from this post:

Also to note, you may want to put a SET $str = COALESCE($str, ''); just before the loop otherwise null values may cause a crash/never ending query. – Tom C Aug 17 at 9:51
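
If NULL values should stay NULL instead of becoming empty strings, an early return is one possible alternative to the COALESCE guard. The sketch below is Boann's loop with only that check added; the function name `strip_tags_keep_null` is made up for this example.

```sql
DELIMITER //

-- Hypothetical name; the loop is the one from Boann's answer.
CREATE FUNCTION strip_tags_keep_null($str TEXT) RETURNS TEXT
DETERMINISTIC
BEGIN
    DECLARE $start, $end INT DEFAULT 1;
    -- Return early so NULL stays NULL and the loop never sees it.
    IF $str IS NULL THEN RETURN NULL; END IF;
    LOOP
        SET $start = LOCATE('<', $str, $start);
        IF (!$start) THEN RETURN $str; END IF;
        SET $end = LOCATE('>', $str, $start);
        IF (!$end) THEN SET $end = $start; END IF;
        SET $str = INSERT($str, $start, $end - $start + 1, '');
    END LOOP;
END //

DELIMITER ;
```

The reason a guard is needed at all: with a NULL argument, LOCATE() returns NULL, `!$start` is then NULL as well, so the `IF` never fires and the loop has no way to exit.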

Answer by billynoah

I'm using the lib_mysqludf_preg library for this, with a regex like this:

SELECT PREG_REPLACE('#<[^>]+>#',' ',cell) FROM table;

I also did it like this for rows with encoded HTML entities:

SELECT PREG_REPLACE('#&lt;.+?&gt;#',' ',cell) FROM table;

There are probably cases where these might fail but I haven't encountered any and they're reasonably fast.

Answer by ajmedway

I just extended @boann's answer to allow targeting of any specific tag, so that tags can be stripped one type at a time with each function call. You just need to pass the tag parameter, e.g. 'a' to strip out all opening/closing anchor tags. This answers the question asked by the OP, unlike the accepted answer, which strips out ALL tags.

# MySQL function to programmatically replace out specified html tags from text/html fields

# run this to drop/update the stored function
DROP FUNCTION IF EXISTS `strip_tags`;

DELIMITER |

# function to nuke all opening and closing tags of type specified in argument 2
CREATE FUNCTION `strip_tags`($str text, $tag text) RETURNS text
BEGIN
    DECLARE $start, $end INT DEFAULT 1;
    SET $str = COALESCE($str, '');
    LOOP
        # note: this matches any tag whose name starts with $tag, so 'a' also hits <abbr>, <article>, ...
        SET $start = LOCATE(CONCAT('<', $tag), $str, $start);
        IF (!$start) THEN RETURN $str; END IF;
        SET $end = LOCATE('>', $str, $start);
        IF (!$end) THEN SET $end = $start; END IF;
        SET $str = INSERT($str, $start, $end - $start + 1, '');
        SET $str = REPLACE($str, CONCAT('</', $tag, '>'), '');
    END LOOP;
END |

DELIMITER ;

# test select to nuke all opening <a> tags
SELECT 
    STRIP_TAGS(description, 'a') AS stripped
FROM
    tmpcat;

# run update query to replace out all <a> tags
UPDATE tmpcat
SET 
    description = STRIP_TAGS(description, 'a');

Answer by Gene Kelly

Compatible with MySQL 8+ and MariaDB 10.0.5+

SELECT REGEXP_REPLACE(body, '<[^>]*>+', '') FROM app_cms_sections
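
Since the question is about cleaning stored data, the same expression can also drive an UPDATE. A minimal sketch, reusing the `app_cms_sections` table and `body` column from the query above and assuming a server where REGEXP_REPLACE() is available:

```sql
-- Preview a few affected rows first.
SELECT body, REGEXP_REPLACE(body, '<[^>]*>+', '') AS stripped
FROM app_cms_sections
WHERE body REGEXP '<[^>]*>'
LIMIT 20;

-- Rewrite only the rows that actually contain something tag-like.
UPDATE app_cms_sections
SET body = REGEXP_REPLACE(body, '<[^>]*>+', '')
WHERE body REGEXP '<[^>]*>';
```

The WHERE clause is there so that rows without any tags are not rewritten needlessly.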

Answer by Foxinni

REPLACE() works pretty well.

The subtle approach:

 REPLACE(REPLACE(node.body,'<p>',''),'</p>','') as `post_content`

...and the not so subtle: (Converting strings into slugs)

 LOWER(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(TRIM(node.title), ':', ''), 'é', 'e'), ')', ''), '(', ''), ',', ''), '\', ''), '\/', ''), '\"', ''), '?', ''), '\'', ''), '&', ''), '!', ''), '.', ''), '–', ''), ' ', '-'), '--', '-'), '--', '-'), ''', '')) as `post_name`