Java: How to remove a specific special-character pattern from a string

Question

Asked by Roshanck

I have a string named s:

String s = "<NOUN>Sam</NOUN> , a student of the University of oxford , won the Ethugalpura International Rating Chess Tournament which concluded on Dec.22 at the Blue Olympiad Hotel";

I want to remove all <NOUN> and </NOUN> tags from the string. I used this to remove the tags:

s.replaceAll("[<NOUN>,</NOUN>]","");

Yes, it removes the tags, but it also removes the letters 'U' and 'O' from the string, which gives me the following output:

 Sam , a student of the niversity of oxford , won the Ethugalpura International Rating Chess Tournament which concluded on Dec.22 at the Blue lympiad Hotel

Can anyone please tell me how to do this correctly?

Answer 1

Answered by Hubro

Try:

s = s.replaceAll("<NOUN>|</NOUN>", "");

In a regular expression, the syntax [...] defines a character class: it matches any single character listed inside the brackets, regardless of the order in which the characters appear. Therefore, in your example, every occurrence of "<", "N", "O", etc. is removed. Instead, use the pipe (|) to match either "<NOUN>" or "</NOUN>".

The following should also work (and could be considered more DRY and elegant) since it will match the tag both with and without the forward slash:

s = s.replaceAll("</?NOUN>", "");
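To make the difference between the three patterns concrete, here is a minimal sketch that runs each of them on a shortened version of the string from the question. The class and method names are made up for illustration. Note that String is immutable, so replaceAll returns a new string that has to be used or assigned.

```java
public class NounTagDemo {
    // Character class: removes every '<', 'N', 'O', 'U', '>', ',' and '/'
    // individually, wherever it appears in the string.
    static String withCharacterClass(String s) {
        return s.replaceAll("[<NOUN>,</NOUN>]", "");
    }

    // Alternation: removes only the complete tokens "<NOUN>" and "</NOUN>".
    static String withAlternation(String s) {
        return s.replaceAll("<NOUN>|</NOUN>", "");
    }

    // Optional slash: "/?" matches zero or one '/', so one pattern covers both tags.
    static String withOptionalSlash(String s) {
        return s.replaceAll("</?NOUN>", "");
    }

    public static void main(String[] args) {
        String s = "<NOUN>Sam</NOUN> of the University";
        System.out.println(withCharacterClass(s)); // prints "Sam of the niversity"
        System.out.println(withAlternation(s));    // prints "Sam of the University"
        System.out.println(withOptionalSlash(s));  // prints "Sam of the University"
    }
}
```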

Answer 2

Answered by Brian Agnew

String.replaceAll() takes a regular expression as its first argument. The regexp:

"[<NOUN>,</NOUN>]"

defines within the brackets the set of characters to be identified and thus removed. Thus you're asking to remove the characters <, >, /, N, O, U and comma.

Perhaps the simplest way to do what you want is:

s = s.replaceAll("<NOUN>","").replaceAll("</NOUN>","");

which is explicit in what it's removing. More complex regular expressions are obviously possible.
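Since both tags are fixed text, the same chained approach can also be written without any regular expression. The sketch below swaps in String.replace(CharSequence, CharSequence), which treats its arguments as literal text and replaces every occurrence. This is a variation on the answer, not what the answer itself uses, and the class and method names are hypothetical.

```java
public class LiteralReplaceDemo {
    // String.replace treats both arguments as literal text, not as a regex,
    // so characters such as '<', '>' and '/' need no special handling.
    static String stripNounTags(String s) {
        return s.replace("<NOUN>", "").replace("</NOUN>", "");
    }

    public static void main(String[] args) {
        String s = "<NOUN>Sam</NOUN> , a student of the University of oxford";
        // Strings are immutable, so the result has to be assigned.
        s = stripNounTags(s);
        System.out.println(s); // prints "Sam , a student of the University of oxford"
    }
}
```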

Answer 3

Answered by Timo Hahn

You can use one regular expression for this: "<[/]*NOUN>", so

s = s.replaceAll("<[/]*NOUN>","");

should do the trick. The "[/]*" matches zero or more "/" characters after the "<".
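One consequence of "zero or more" is that this pattern is slightly more permissive than one with an optional slash: it also accepts tags with several slashes. A small sketch of the difference, using String.matches (which tests the whole string against the pattern); the class and method names are invented for the example.

```java
public class SlashQuantifierDemo {
    // "[/]*" allows any number of slashes, including none.
    static boolean matchesStar(String tag) {
        return tag.matches("<[/]*NOUN>");
    }

    // "/?" allows at most one slash.
    static boolean matchesOptional(String tag) {
        return tag.matches("</?NOUN>");
    }

    public static void main(String[] args) {
        System.out.println(matchesStar("<NOUN>"));       // true
        System.out.println(matchesStar("</NOUN>"));      // true
        System.out.println(matchesStar("<//NOUN>"));     // true
        System.out.println(matchesOptional("<//NOUN>")); // false
    }
}
```

For the string in the question both patterns give the same result; the difference only matters if malformed tags such as "<//NOUN>" can occur.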

Answer 4

Answered by abdelhadi

Try this:

String result = originValue.replaceAll("\\<.*?>", "");
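Note that this pattern is not specific to NOUN: it removes anything that starts with "<" and runs up to the next ">". A short sketch comparing it with the NOUN-only pattern on a string that also contains a different tag; the sample string and the class and method names are made up for illustration.

```java
public class GenericTagDemo {
    // Removes every "<...>" span; ".*?" is reluctant, so each match stops
    // at the first '>' that follows the '<'.
    static String stripAllTags(String s) {
        return s.replaceAll("\\<.*?>", "");
    }

    // Removes only the <NOUN> and </NOUN> tags.
    static String stripNounTags(String s) {
        return s.replaceAll("</?NOUN>", "");
    }

    public static void main(String[] args) {
        String s = "<NOUN>Sam</NOUN> likes <b>chess</b>";
        System.out.println(stripAllTags(s));  // prints "Sam likes chess"
        System.out.println(stripNounTags(s)); // prints "Sam likes <b>chess</b>"
    }
}
```

If the text can contain other angle brackets that should be kept, the NOUN-specific pattern is the safer choice.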
