Original question: http://stackoverflow.com/questions/4685737/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): Stack Overflow
Ignore escape characters (backslashes) in R strings
Asked by mhermans
While running an R plugin in SPSS, I receive a Windows path string as input, e.g.
'C:\Users\mhermans\somefile.csv'
I would like to use that path in subsequent R code, but the backslashes need to be replaced with forward slashes first; otherwise R interprets them as escapes (e.g. "\U used without hex digits" errors).
However, I have not been able to find a function that can replace the backslashes with forward slashes or double-escape them. All the functions I tried assume those characters are already escaped.
So, is there something along the lines of:
>gsub('\', '/', 'C:\Users\mhermans')
C:/Users/mhermans
Accepted answer by Sacha Epskamp
You can try the 'allowEscapes' argument of scan(), typing the path at the prompt and ending input with an empty line:
X <- scan(what = "character", allowEscapes = FALSE)
C:\Users\mhermans\somefile.csv

print(X)
[1] "C:\\Users\\mhermans\\somefile.csv"
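For non-interactive use, the same idea can be sketched by pointing scan() at a file instead of the console. This is a minimal sketch and not part of the original answer: the temporary file and the names `tmp` and `x` are assumptions for illustration, and `sep = "\n"` is added so that a path containing spaces is not split into several elements.

```r
# Hypothetical setup: write the raw path to a temporary file.
# The doubled backslashes are only R source syntax; the file
# receives single backslashes.
tmp <- tempfile(fileext = ".txt")
writeLines("C:\\Users\\mhermans\\somefile.csv", tmp)

# Read it back without interpreting backslash escapes.
x <- scan(tmp, what = "character", sep = "\n",
          allowEscapes = FALSE, quiet = TRUE)

writeLines(x)  # shows the string as stored, with single backslashes
nchar(x)       # counts each backslash once
unlink(tmp)
```

Whether the SPSS plugin can hand the path over through a file or connection like this depends on the plugin, so treat the file-based setup as a stand-in for whatever input channel is available.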
Answer by IRTFM
First you need to get it assigned to a name:
pathname <- 'C:\\Users\\mhermans\\somefile.csv'
Notice that in order to assign it to a name you needed to double them all, which gives a hint about how you could use regex. Actually, if you read it in from a text file, then R will do all the doubling for you. Mind you, it is not really doubling the backslashes. Each one is stored as a single backslash, but it is displayed doubled and needs to be typed doubled at the console. Otherwise the R interpreter tries (and often fails) to turn it into a special character. To compound the problem, regex uses the backslash as an escape as well. So to match a backslash with grep, sub, or gsub you need to quadruple the backslashes:
gsub("\\\\", "/", pathname)
# [1] "C:/Users/mhermans/somefile.csv"
You needed to doubly "double" the backslashes. The first of each pair of \'s signals to the regex engine that what comes next is a literal.
Consider:
nchar("\\A")
# returns `[1] 2`
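A short sketch of the two points in this answer, assuming the same `pathname` value: the string holds single backslashes even though print() shows them doubled, and the regex layer of escaping can be skipped with `fixed = TRUE` or `chartr()`. The names `fwd_regex`, `fwd_fixed`, and `fwd_chartr` are illustrative only.

```r
pathname <- "C:\\Users\\mhermans\\somefile.csv"

print(pathname)       # escaped display form, backslashes shown doubled
writeLines(pathname)  # actual contents, single backslashes

# Regex match: four backslashes in source match one literal backslash.
fwd_regex <- gsub("\\\\", "/", pathname)

# Literal match: fixed = TRUE turns off regex, so two backslashes suffice.
fwd_fixed <- gsub("\\", "/", pathname, fixed = TRUE)

# chartr() translates single characters and never uses regex.
fwd_chartr <- chartr("\\", "/", pathname)

identical(fwd_regex, fwd_fixed) && identical(fwd_fixed, fwd_chartr)
```

The literal-matching variants avoid one level of escaping, which tends to make the intent easier to read.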
Answer by bill_080
If file E:\Data\junk.txt contains the following text (without quotes): C:\Users\mhermans\somefile.csv
You may get a warning with the following statement, but it will work:
texinp <- readLines("E:\\Data\\junk.txt")
If file E:\Data\junk.txt contains the following text (with quotes): "C:\Users\mhermans\somefile.csv"
The above readLines statement might also give you a warning, but texinp will now contain:
"\"C:\\Users\\mhermans\\somefile.csv\""
So, to get what you want, make sure there aren't quotes in the incoming file, and use:
texinp <- suppressWarnings(readLines("E:\\Data\\junk.txt"))
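As an alternative sketch, readLines() has a `warn` argument that silences its own warnings directly, and stray quotes can be stripped after reading rather than fixed in the file. This assumes the warning in question is the "incomplete final line" one, which the answer does not state. The temporary file below stands in for E:\Data\junk.txt and is an assumption made so the example is self-contained.

```r
# Hypothetical stand-in for the input file: quoted path, no trailing newline.
tmp <- tempfile(fileext = ".txt")
cat("\"C:\\Users\\mhermans\\somefile.csv\"", file = tmp)

# warn = FALSE skips the "incomplete final line" warning.
texinp <- readLines(tmp, warn = FALSE)

# Drop any double quotes, then switch to forward slashes (literal matching).
texinp <- gsub("\"", "", texinp, fixed = TRUE)
texinp <- gsub("\\", "/", texinp, fixed = TRUE)

writeLines(texinp)
unlink(tmp)
```

Compared with suppressWarnings(), `warn = FALSE` only hides the warnings readLines() itself raises, so unrelated warnings stay visible.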
Answer by drf
As of version 4.0, released in April 2020, R provides a syntax for specifying raw strings. The string in the example can be written as:
path <- r"(C:\Users\mhermans\somefile.csv)"
From ?Quotes:
Raw character constants are also available using a syntax similar to the one used in C++: r"(...)" with ... any character sequence, except that it must not contain the closing sequence )". The delimiter pairs [] and {} can also be used, and R can be used in place of r. For additional flexibility, a number of dashes can be placed between the opening quote and the opening delimiter, as long as the same number of dashes appear between the closing delimiter and the closing quote.
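The delimiter and dash variants described in the quoted passage can be sketched as follows, assuming R 4.0.0 or later. The names `p1`, `p2`, `p3`, `escaped`, and `fwd` are illustrative only.

```r
# Requires R >= 4.0.0 for the raw-string syntax.
p1 <- r"(C:\Users\mhermans\somefile.csv)"
p2 <- R"[C:\Users\mhermans\somefile.csv]"        # alternative delimiters
p3 <- r"---(C:\Users\mhermans\somefile.csv)---"  # dashes around the delimiters

# All three are the same string as the classic escaped literal.
escaped <- "C:\\Users\\mhermans\\somefile.csv"
all(c(p1, p2, p3) == escaped)

# Converting to forward slashes; the pattern can itself be a raw string.
fwd <- gsub(r"(\)", "/", p1, fixed = TRUE)
writeLines(fwd)
```

Raw strings only change how a literal is written in source code. A path that arrives at run time, as in the question, already contains single backslashes and needs no raw-string handling.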