Convert regular Python string to raw string
Original question: http://stackoverflow.com/questions/4415259/
Note: this page reproduces a popular Stack Overflow question and its answers, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must keep the same license and attribute it to the original authors on Stack Overflow.
Asked by rectangletangle
I have a string s whose contents are variable. I'd like to make it a raw string. How do I go about this?
Something similar to the r'' prefix.
Answered by SingleNegationElimination
Raw strings apply only to string literals. They exist so that you can more conveniently express strings that would otherwise be modified by escape sequence processing. This is especially useful when writing regular expressions, or other forms of code, in string literals. If you want a unicode string without escape processing in Python 2, just prefix it with ur, like ur'somestring'. Python 3 does not accept the ur prefix; there every str is already Unicode, so r'somestring' is enough.
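As a small illustration of why raw literals are convenient for regular expressions, the sketch below writes the same pattern twice, once with doubled backslashes and once as a raw literal. The pattern and the sample text are made up for this example.

```python
import re

# The same pattern written as a regular literal and as a raw literal.
pattern_regular = '\\d+\\.\\d+'
pattern_raw = r'\d+\.\d+'

# Both literals describe exactly the same string.
same_pattern = pattern_regular == pattern_raw

# Search a made-up sample text for a number with a decimal point.
match = re.search(pattern_raw, 'version 3.12 released')
found = match.group(0) if match else None

print(same_pattern, found)
```

The raw form is easier to read because each backslash appears once, as the regular expression engine will see it.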
Answered by Karl Knechtel
Raw strings are not a different kind of string. They are a different way of describing a string in your source code. Once the string is created, it is what it is.
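A quick way to see this is to build the same value from a regular literal and from a raw literal and compare the results. The path used here is only an illustrative value.

```python
# Two spellings of the same text in source code.
regular = 'C:\\new\\table'
raw = r'C:\new\table'

# Both are ordinary str objects with identical contents.
same_type = type(regular) is type(raw)
same_value = regular == raw
length = len(raw)

print(same_type, same_value, length)
```

Nothing on the resulting object records which literal syntax produced it.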
Answered by Jolly1234
I believe what you're looking for is the str.encode("string-escape") function. For example, if you have a variable that you want to 'raw string':
a = '\x89'
a.encode('unicode_escape')
b'\\x89'
Note: use 'string-escape' instead on Python 2.x; that codec is not available in Python 3.
I was searching for a similar solution and found this one via: casting raw strings python
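For Python 3, the idea above can be sketched as a pair of small helpers. The names escape_for_display and unescape are hypothetical, chosen for this example; the only library feature used is the built-in unicode_escape codec.

```python
def escape_for_display(text):
    """Return text with control and non-ASCII characters written as escape sequences."""
    # encode() yields bytes such as b'\\x89'; decode them back to a str.
    return text.encode('unicode_escape').decode('ascii')


def unescape(text):
    """Reverse escape_for_display for strings that it produced."""
    return text.encode('ascii').decode('unicode_escape')


original = 'tab\there\x89\n'
escaped = escape_for_display(original)
restored = unescape(escaped)

print(escaped)
print(restored == original)
```

The escaped value contains literal backslashes, so printing it shows the escape sequences rather than a tab or a line break.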
Answered by rjurney
For Python 3, the way to do this that doesn't add double backslashes and simply preserves \n, \t, etc. is:
a = 'hello\nbobby\nsally\n'
# Escape control characters, then collapse any doubled backslashes into single ones
a = a.encode('unicode-escape').decode().replace('\\\\', '\\')
print(a)
Which gives a value that can be written as CSV:
hello\nbobby\nsally\n
There doesn't seem to be a solution, however, for other special characters that may end up with a single \ before them. It's a bummer. Solving that would be complex.
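One consequence of collapsing doubled backslashes, shown here as a hedged sketch, is that the conversion is no longer reversible: a real newline and a literal backslash followed by n end up as the same text. The helper name escape_single_backslash is hypothetical, and this may be only part of what the answer has in mind.

```python
def escape_single_backslash(text):
    """Escape control characters, then collapse doubled backslashes."""
    return text.encode('unicode-escape').decode().replace('\\\\', '\\')


with_newline = 'a\nb'        # contains an actual newline character
with_backslash_n = 'a\\nb'   # contains a backslash followed by the letter n

out_newline = escape_single_backslash(with_newline)
out_backslash = escape_single_backslash(with_backslash_n)

# Both inputs map to the same four characters: a, backslash, n, b.
collide = out_newline == out_backslash
print(out_newline, out_backslash, collide)
```

If the output has to be read back later, keeping the doubled backslashes avoids this ambiguity.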
For example, to serialize a pandas.Series containing lists of strings with special characters into a text file in the format BERT expects, with a line break between each sentence and a blank line between each document:
with open('sentences.csv', 'w') as f:
    current_idx = 0
    for idx, doc in sentences.items():
        # Insert a newline to separate documents
        if idx != current_idx:
            f.write('\n')
        # Write each sentence exactly as it appeared, one per line
        for sentence in doc:
            f.write(sentence.encode('unicode-escape').decode().replace('\\\\', '\\') + '\n')
This outputs (for the GitHub CodeSearchNet docstrings for all languages, tokenized into sentences):
Makes sure the fast-path emits in order.
@param value the value to emit or queue up\n@param delayError if true, errors are delayed until the source has terminated\n@param disposable the resource to dispose if the drain terminates
Mirrors the one ObservableSource in an Iterable of several ObservableSources that first either emits an item or sends\na termination notification.
Scheduler:\n{@code amb} does not operate by default on a particular {@link Scheduler}.
@param the common element type\n@param sources\nan Iterable of ObservableSource sources competing to react first.
A subscription to each source will\noccur in the same order as in the Iterable.
@return an Observable that emits the same sequence as whichever of the source ObservableSources first\nemitted an item or sent a termination notification\n@see ReactiveX operators documentation: Amb
...
Answered by slashCoder
Since strings in Python are immutable, you cannot "make it" anything different. You can, however, create a new raw string from s, like this:
raw_s = r'{}'.format(s)
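It may help to check what this expression produces. In the sketch below, with a made-up value for s, the r prefix affects only the literal '{}', which contains no backslashes, so the formatted result compares equal to s.

```python
s = 'first line\nsecond line'

# The raw prefix applies to the template '{}' only, not to the value inserted into it.
raw_s = r'{}'.format(s)

unchanged = raw_s == s
still_has_newline = '\n' in raw_s

print(unchanged, still_has_newline)
```

Any escape sequences in s were already processed when s was created, so the newline is still a newline in raw_s.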
Answered by dheinz
As of Python 3.6, you can use the following (similar to @slashCoder):
def to_raw(string):
    return fr"{string}"

my_dir = "C:\data\projects"
to_raw(my_dir)
This yields 'C:\\data\\projects'. I use it on a Windows 10 machine to pass directories to functions.
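A caveat worth checking, sketched here with made-up paths: the function receives a string whose literal has already been processed, so the result depends on how the path was written in the source. The variable names below are illustrative only.

```python
def to_raw(string):
    return fr"{string}"


escaped_path = "C:\\temp\\new"  # backslashes written explicitly
risky_path = "C:\temp\new"      # \t and \n are read as a tab and a newline

# The function returns its argument's contents unchanged in both cases.
same_as_input = to_raw(escaped_path) == escaped_path
tab_survives = '\t' in to_raw(risky_path)
newline_survives = '\n' in to_raw(risky_path)

print(same_as_input, tab_survives, newline_survives)
```

Writing the path as a raw literal, such as r"C:\temp\new", or doubling the backslashes keeps them intact before the string ever reaches the function.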

