Notice: this page reproduces a popular Stack Overflow question and its answers under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). Stack Overflow original: http://stackoverflow.com/questions/2870730/

Python raw strings and trailing backslash

python windows escaping

Asked by dash-tom-bang

I ran across something once upon a time and wondered if it was a Python "bug" or at least a misfeature. I'm curious if anyone knows of any justifications for this behavior. I thought of it just now reading "Code Like a Pythonista," which has been enjoyable so far. I'm only familiar with the 2.x line of Python.

Raw strings are strings that are prefixed with an r. This is great because I can use backslashes in regular expressions and I don't need to double everything everywhere. It's also handy for writing throwaway scripts on Windows, so I can use backslashes there also. (I know I can also use forward slashes, but throwaway scripts often contain content cut&pasted from elsewhere in Windows.)

So great! Unless, of course, you really want your string to end with a backslash. There's no way to do that in a 'raw' string.

In [9]: r'\n'
Out[9]: '\\n'

In [10]: r'abc\n'
Out[10]: 'abc\\n'

In [11]: r'abc\'
------------------------------------------------
   File "<ipython console>", line 1
     r'abc\'
           ^
SyntaxError: EOL while scanning string literal


In [12]: r'abc\\'
Out[12]: 'abc\\\\'

So one backslash before the closing quote is an error, but two backslashes gives you two backslashes! Certainly I'm not the only one that is bothered by this?

Thoughts on why 'raw' strings are 'raw, except for backslash-quote'? I mean, if I wanted to embed a single quote in there I'd just use double quotes around the string, and vice versa. If I wanted both, I'd just triple quote. If I really wanted three quotes in a row in a raw string, well, I guess I'd have to deal, but is this considered "proper behavior"?

This is particularly problematic with folder names in Windows, where the backslash is the path delimiter.

Accepted answer by John Machin

It's a FAQ.

And in response to "you really want your string to end with a backslash. There's no way to do that in a 'raw' string.": the FAQ shows how to work around it.

>>> r'ab\c' '\\' == 'ab\\c\\'
True
>>>
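
As a rough sketch of that workaround applied to a Windows-style folder name: the path below is made up for illustration, and the `ntpath.join` variant is an alternative not mentioned in the answer above, shown only because it yields the same string. `ntpath` is the Windows flavour of `os.path` and can be imported on any platform.

```python
import ntpath  # Windows flavour of os.path, importable on any platform

# Adjacent string literals are concatenated at compile time, so a raw
# literal can be followed by a normal literal that supplies the backslash.
# The folder name is hypothetical.
folder = r'C:\temp\logs' '\\'

# The result ends with exactly one backslash.
assert folder == 'C:\\temp\\logs\\'
assert folder.endswith('\\') and not folder.endswith('\\\\')

# Joining with an empty final component also appends a single separator.
joined = ntpath.join(r'C:\temp\logs', '')
assert joined == folder

print(folder)
```

Both forms keep the readable raw literal for the bulk of the path and only fall back to an escaped backslash for the final character.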

Answer by Alex Martelli

Raw strings are meant mostly for readably writing the patterns for regular expressions, which never need a trailing backslash; it's an accident that they may come in handy for Windows (where you could use forward slashes in most cases anyway -- the Microsoft C library which underlies Python accepts either form!). It's not considered acceptable to make it (nearly) impossible to write a regular expression pattern containing both single and double quotes, just to reinforce the accident in question.

("Nearly" because triple-quoting would almost always help... but it could be a little bit of a pain sometimes).

So, yes, raw strings were designed to behave that way (forbidding odd numbers of trailing backslashes), and it is considered perfectly "proper behavior" for them to respect the design decisions Guido made when he invented them;-).
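
The odd/even rule can be checked with a small sketch like the one below. It uses the built-in `compile` to ask whether a piece of source text is a valid expression; the helper name `is_valid_literal` is made up for this illustration.

```python
def is_valid_literal(source):
    """Return True if `source` compiles as a Python expression."""
    try:
        compile(source, '<example>', 'eval')
        return True
    except SyntaxError:
        return False

# Build the source text of r'abc' followed by n trailing backslashes
# and record whether Python accepts it.
results = {}
for n in range(5):
    source = "r'abc" + "\\" * n + "'"
    results[n] = is_valid_literal(source)
    print(n, source, results[n])

# Even counts of trailing backslashes compile; odd counts do not.
assert results == {0: True, 1: False, 2: True, 3: False, 4: True}
```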

Answer by GravityWell

Another way to work around this is:

 >>> print(r"Raw \with\ trailing backslash\ "[:-1])
 Raw \with\ trailing backslash\

Updated for Python 3 and removed unnecessary slash at the end which implied an escape.

Note that personally I doubt I would use the above. I guess maybe if it was a huge string with more than just a path. For the above I'd prefer non-raw and double up the slashes.

Answer by user207421

"Thoughts on why 'raw' strings are 'raw, except for backslash-quote'? I mean, if I wanted to embed a single quote in there I'd just use double quotes around the string, and vice versa."

But that would then raise the question as to why raw strings are 'raw, except for embedded quotes?'

You have to have some escape mechanism, otherwise you can never use the outer quote characters inside the string at all. And then you need an escape mechanism for the escape mechanism.
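
A short sketch of what that escape mechanism looks like in practice, with made-up sample strings: in a raw literal a backslash still stops the following quote from ending the string, but unlike in a normal literal the backslash itself stays in the result.

```python
# In a raw literal, backslash-quote keeps the literal open and both
# characters end up in the string.
raw = r'it\'s'
assert len(raw) == 5          # i, t, backslash, quote, s
assert raw == "it\\'s"

# In a normal literal, the backslash is consumed by the escape.
normal = 'it\'s'
assert len(normal) == 4       # i, t, quote, s
assert normal == "it's"

print(raw, normal)
```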
