将常规 Python 字符串转换为原始字符串

Question

提问by rectangletangle

I have a string s, its contents are variable. I'd like to make it a raw string. How do I go about this?

我有一个字符串s，它的内容是可变的。我想把它变成一个原始字符串。我该怎么做？

Something similar to the r''method.

类似于r''方法的东西。

Answer 1

回答by SingleNegationElimination

raw strings apply only to string literals. they exist so that you can more conveniently express strings that would be modified by escape sequence processing. This is most especially useful when writing out regular expressions, or other forms of code in string literals. if you want a unicode string without escape processing, just prefix it with ur, like ur'somestring'.

原始字符串仅适用于字符串文字。它们的存在是为了您可以更方便地表达将被转义序列处理修改的字符串。这在用字符串文字写出正则表达式或其他形式的代码时特别有用。如果你想要一个没有转义处理的 unicode 字符串，只需在它前面加上前缀ur，比如ur'somestring'.

Answer 2

回答by Karl Knechtel

Raw strings are not a different kind of string. They are a different way of describing a string in your source code. Once the string is created, it is what it is.

原始字符串不是另一种字符串。它们是在源代码中描述字符串的不同方式。一旦创建了字符串，它就是它的样子。

Answer 3

回答by Jolly1234

i believe what you're looking for is the str.encode("string-escape") function. For example, if you have a variable that you want to 'raw string':

我相信你正在寻找的是 str.encode("string-escape") 函数。例如，如果您有一个想要“原始字符串”的变量：

a = '\x89'
a.encode('unicode_escape')
'\x89'

Note: Use string-escapefor python 2.x and older versions

注意：string-escape用于 python 2.x 和旧版本

I was searching for a similar solution and found the solution via: casting raw strings python

我正在寻找类似的解决方案，并通过以下方式找到了解决方案： cast raw strings python

Answer 4

回答by rjurney

For Python 3, the way to do this that doesn't add double backslashes and simply preserves \n, \t, etc. is:

对于Python 3，顺便可以做到这一点不添加双反斜线和简单的蜜饯\n，\t等是：

a = 'hello\nbobby\nsally\n'
a.encode('unicode-escape').decode().replace('\\', '\')
print(a)

Which gives a value that can be written as CSV:

这给出了一个可以写为 CSV 的值：

hello\nbobby\nsally\n

There doesn't seem to be a solution for other special characters, however, that may get a single \ before them. It's a bummer. Solving that would be complex.

其他特殊字符似乎没有解决方案，但是，在它们之前可能会有一个 \ 。这是一个无赖。解决这个问题会很复杂。

For example, to serialize a pandas.Seriescontaining a list of strings with special characters in to a textfile in the format BERTexpects with a CR between each sentence and a blank line between each document:

例如，要将pandas.Series包含具有特殊字符的字符串列表的 a序列化为BERT期望的格式的文本文件，每个句子之间有一个 CR，每个文档之间有一个空行：

with open('sentences.csv', 'w') as f:

    current_idx = 0
    for idx, doc in sentences.items():
        # Insert a newline to separate documents
        if idx != current_idx:
            f.write('\n')
        # Write each sentence exactly as it appared to one line each
        for sentence in doc:
            f.write(sentence.encode('unicode-escape').decode().replace('\\', '\') + '\n')

This outputs (for the Github CodeSearchNet docstrings for all languages tokenized into sentences):

此输出（对于标记为句子的所有语言的 Github CodeSearchNet 文档字符串）：

Makes sure the fast-path emits in order.
@param value the value to emit or queue up\n@param delayError if true, errors are delayed until the source has terminated\n@param disposable the resource to dispose if the drain terminates

Mirrors the one ObservableSource in an Iterable of several ObservableSources that first either emits an item or sends\na termination notification.
Scheduler:\n{@code amb} does not operate by default on a particular {@link Scheduler}.
@param  the common element type\n@param sources\nan Iterable of ObservableSource sources competing to react first.
A subscription to each source will\noccur in the same order as in the Iterable.
@return an Observable that emits the same sequence as whichever of the source ObservableSources first\nemitted an item or sent a termination notification\n@see ReactiveX operators documentation: Amb


...

Answer 5

回答by slashCoder

Since strings in Python are immutable, you cannot "make it" anything different. You can however, create a newraw string from s, like this:

由于 Python 中的字符串是不可变的，因此您不能“使它”有任何不同。但是，您可以从中创建一个新的原始字符串s，如下所示：

raw_s = r'{}'.format(s)

Answer 6

回答by dheinz

As of Python 3.6, you can use the following (similar to @slashCoder):

从 Python 3.6 开始，您可以使用以下内容（类似于 @slashCoder）：

def to_raw(string):
    return fr"{string}"

my_dir ="C:\data\projects"
to_raw(my_dir)

yields 'C:\\data\\projects'. I'm using it on a Windows 10 machine to pass directories to functions.

产量'C:\\data\\projects'。我在 Windows 10 机器上使用它来将目录传递给函数。

将常规 Python 字符串转换为原始字符串

提问by rectangletangle

回答by SingleNegationElimination

回答by Karl Knechtel

回答by Jolly1234

回答by rjurney

回答by slashCoder

回答by dheinz

相关推荐

最近更新

标签

将常规 Python 字符串转换为原始字符串

提问by rectangletangle

回答by SingleNegationElimination

回答by Karl Knechtel

回答by Jolly1234

回答by rjurney

回答by slashCoder

回答by dheinz

相关推荐

如何在python中查找并附加到二进制文件？

Python NumPy 数组的就地类型转换

如何使用 Python urlopen 获取非 ascii url？

Python 如何让过滤器与带多个参数的 lambda 一起使用？

相关推荐

最近更新

标签