Removing simple HTML tags from a string in Oracle via regular expressions: explanation needed

Question

Asked by Basti

I do not understand why my columns reg1 and reg2 remove "bbb" from my string, while only reg3 works as expected.

WITH t AS (SELECT 'aaa <b>bbb</b> ccc' AS teststring FROM dual)

SELECT
  teststring,
  regexp_replace(teststring, '<.+>') AS reg1,
  regexp_replace(teststring, '<.*>') AS reg2,
  regexp_replace(teststring, '<.*?>') AS reg3
FROM t


TESTSTRING             REG1        REG2          REG3
aaa <b>bbb</b> ccc     aaa ccc     aaa ccc       aaa bbb ccc

Thanks a lot!

Answer 1

Answered by Olivier Jacot-Descombes

Because regex quantifiers are greedy by default, i.e. the expressions .* or .+ try to take as many characters as possible. Therefore <.+> will span from the first < to the last >. Make it lazy by appending the lazy operator ?:

regexp_replace(teststring, '<.+?>')

or

regexp_replace(teststring, '<.*?>')

Now the search for > will stop at the first > encountered.

Note that . matches > as well, so the greedy variant (without ?) swallows every > but the last.
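
To see the difference directly, it can help to look at what each pattern matches rather than at what is left after the replacement. The following is a minimal sketch that reuses the test string from the question; the column aliases greedy_match and lazy_match are made up for illustration, and it assumes an Oracle version that supports lazy quantifiers in its regex functions (the question's reg3 result suggests it does).

```sql
WITH t AS (SELECT 'aaa <b>bbb</b> ccc' AS teststring FROM dual)
SELECT
  -- greedy: runs from the first < to the last >
  regexp_substr(teststring, '<.+>')  AS greedy_match,  -- <b>bbb</b>
  -- lazy: stops at the first > after the opening <
  regexp_substr(teststring, '<.+?>') AS lazy_match     -- <b>
FROM t;
```

regexp_substr returns only the first match, which is enough here to show why the greedy variants delete "bbb" along with the tags.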

Answer 2

Answered by DevilPinky

Because the first and the second are finding this match: <b>bbb</b>. In this case b>bbb</b matches both .* and .+
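
The captured part can be inspected with a subexpression. This is a sketch only: it wraps the quantified part in parentheses (which the original patterns do not have) and assumes a release where regexp_substr accepts the sixth subexpression argument (11g or later, as far as I know). The aliases inside_plus and inside_star are illustrative names.

```sql
WITH t AS (SELECT 'aaa <b>bbb</b> ccc' AS teststring FROM dual)
SELECT
  -- arguments: source, pattern, start position, occurrence, match flags, subexpression number
  regexp_substr(teststring, '<(.+)>', 1, 1, NULL, 1) AS inside_plus,  -- b>bbb</b
  regexp_substr(teststring, '<(.*)>', 1, 1, NULL, 1) AS inside_star   -- b>bbb</b
FROM t;
```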

The third one also won't do what you need. You are looking for something like this: <[^>]*>. But you also need to replace all matches with "".
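
Applied to the test string from the question, the negated character class could look like the sketch below. The alias stripped is made up for illustration. As far as I know, leaving out the replacement argument makes regexp_replace delete every match, which is what the question's own queries rely on.

```sql
WITH t AS (SELECT 'aaa <b>bbb</b> ccc' AS teststring FROM dual)
SELECT
  -- [^>]* cannot run past a >, so each tag is matched on its own
  -- without depending on lazy quantifiers
  regexp_replace(teststring, '<[^>]*>') AS stripped  -- aaa bbb ccc
FROM t;
```

On this input the result is the same as the lazy pattern's, but the negated class states the intent ("anything except >") explicitly.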

Answer 3

Answered by A_Developer_in_Austin_TX

If you are merely trying to display the string without all the HTML tags, you can use the function: utl_i18n.unescape_reference(column_name)
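
A caveat worth checking against the Oracle documentation: as I understand it, utl_i18n.unescape_reference decodes character references such as &lt; into the characters they stand for rather than removing tags, so it may work better as a complement to the regex than as a replacement for it. The sketch below combines the two on a made-up input string; the alias plain_text is illustrative, and the exact set of references that gets decoded is an assumption to verify on your release.

```sql
-- In SQL*Plus or SQL Developer, run SET DEFINE OFF first so that
-- the & in the literal is not treated as a substitution variable.
SELECT
  utl_i18n.unescape_reference(
    regexp_replace('aaa <b>bbb &lt; ccc</b>', '<[^>]*>')  -- strip the tags first
  ) AS plain_text                                          -- then decode &lt;
FROM dual;
```

Stripping the tags before decoding matters here: decoding first would turn &lt; into a literal <, which the tag pattern could then mistake for the start of a tag.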
