Removing certain characters from a string in R

Notice: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). Original StackOverflow URL: http://stackoverflow.com/questions/15170250/

Date: 2020-09-09 01:49:38  Source: igfitidea

stringr

Asked by Ryan Warnick

I have a string in R that contains a large number of words. When I view the string, I get a large amount of text, including passages like the following:

>docs

....

\u009cYes yes for ever for ever the boys cried in their ringing voices with softened faces

....

So I'm wondering how to remove these \u009 characters (all of them; some have slightly different numbers) from the string. I've tried using gsub(), but it did not remove them from the strings.

Answer by agstudy

This should work:

gsub('\u009c','','\u009cYes yes for ever for ever the boys ')
"Yes yes for ever for ever the boys "

Here 009c is the hexadecimal code point of the Unicode character. Always specify all 4 hexadecimal digits: R reads up to four hex digits after \u, so a shorter escape followed by a character that happens to be a hex digit would be misread. If you have many such characters, one solution is to separate them with a pipe:

gsub('\u009c|\u00F0','','\u009cYes yes \u00F0for ever for ever the boys and the girls')

"Yes yes for ever for ever the boys and the girls"

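Since the question mentions several escapes with slightly different numbers, listing each one with a pipe can get tedious. The following is a minimal sketch, not part of the original answer, that first lists the non-ASCII code points present and then removes a whole range with a bracket expression. It assumes the unwanted characters are all C1 control characters (U+0080 to U+009F); the sample vector `x` and the other variable names are made up for illustration.

```r
# Hypothetical sample data containing two different C1 control characters
x <- c("\u009cYes yes for ever", "the boys\u009d cried")

# List the non-ASCII code points that occur, to see which escapes are present
codes <- unique(unlist(lapply(x, utf8ToInt)))
sprintf("U+%04X", codes[codes > 127])
# "U+009C" "U+009D"

# Remove every character in the assumed range U+0080..U+009F in one pass
cleaned <- gsub("[\u0080-\u009f]", "", x)
cleaned
# "Yes yes for ever" "the boys cried"
```

If the listing shows code points outside that range, the bracket expression would need to be adjusted to match.
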
Answer by Nic

Try: gsub('\\$', '', '$5.00$')
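
The double backslash is needed because `$` is a regular expression metacharacter (end of string), and the backslash itself must be escaped inside an R string literal. As a small sketch of the difference (the variable names here are illustrative, not from the answer), `fixed = TRUE` is an alternative that treats the pattern as a literal string:

```r
price <- "$5.00$"

# Escaped regex: "\\$" in R source is the regex \$, a literal dollar sign
escaped <- gsub("\\$", "", price)

# fixed = TRUE disables regex interpretation, so "$" is matched literally
literal <- gsub("$", "", price, fixed = TRUE)

# Unescaped regex: "$" only matches the end of the string,
# so the string comes back unchanged
unescaped <- gsub("$", "", price)

escaped    # "5.00"
literal    # "5.00"
unescaped  # "$5.00$"
```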