Python .join or string concatenation
Source: http://stackoverflow.com/questions/4166665/
Note: this content comes from StackOverflow and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors on StackOverflow.
Asked by Falmarri
I realise that if you have an iterable you should always use .join(iterable) instead of for x in y: str += x. But if there's only a fixed number of variables that aren't already in an iterable, is using .join() still the recommended way?
For example I have
user = 'username'
host = 'host'
should I do
ret = user + '@' + host
or
ret = '@'.join([user, host])
I'm not so much asking from a performance point of view, since both will be pretty trivial. But I've read people on here say always use .join() and I was wondering if there's any particular reason for that or if it's just generally a good idea to use .join().
Accepted answer by Thomas Wouters
If you're creating a string like that, you normally want to use string formatting:
>>> user = 'username'
>>> host = 'host'
>>> '%s@%s' % (user, host)
'username@host'
Python 2.6 added another form, which doesn't rely on operator overloading and has some extra features:
>>> '{0}@{1}'.format(user, host)
'username@host'
As a general guideline, most people will use + on strings only if they're adding two strings right there. For more parts or more complex strings, they either use string formatting, like above, or assemble elements in a list and join them together (especially if there's any form of looping involved). The reason for using str.join() is that adding strings together means creating a new string (and potentially destroying the old ones) for each addition. Python can sometimes optimize this away, but str.join() quickly becomes clearer, more obvious and significantly faster.
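As a rough sketch of the "assemble elements in a list and join them" approach described above, the snippet below builds the same string both ways inside a loop. The function names and the sample data are made up for illustration and do not come from the original answer.

```python
def build_with_concat(parts):
    """Build the result with +=; each step may create a new string object."""
    result = ""
    for part in parts:
        result += part
    return result


def build_with_join(parts):
    """Collect the pieces in a list first, then join them once at the end."""
    pieces = []
    for part in parts:
        pieces.append(part)
    return "".join(pieces)


# Hypothetical sample data, chosen to match the question's user/host example.
words = ["username", "@", "host"]
print(build_with_concat(words))  # username@host
print(build_with_join(words))    # username@host
```

Both functions return the same string; the difference is only in how many intermediate strings may be created along the way.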
Answer by anti_social
I use the following:
ret = '%s@%s' % (user, host)
Answer by Glenn Maynard
(I'm pretty sure all of the people pointing at string formatting are missing the question entirely.)
Creating a string by constructing an array and joining it is for performance reasons only. Unless you need that performance, or unless it happens to be the natural way to implement it anyway, there's no benefit to doing that rather than simple string concatenation.
Saying '@'.join([user, host]) is unintuitive. It makes me wonder: why is he doing this? Are there any subtleties to it; is there any case where there might be more than one '@'? The answer is no, of course, but it takes more time to come to that conclusion than if it was written in a natural way.
Don't contort your code merely to avoid string concatenation; there's nothing inherently wrong with it. Joining arrays is just an optimization.
Answer by Nick Perkins
I take the question to mean: "Is it ok to do this:"
ret = user + '@' + host
...and the answer is yes. That is perfectly fine.
You should, of course, be aware of the cool formatting stuff you can do in Python, and you should be aware that for long lists, "join" is the way to go, but for a simple situation like this, what you have is exactly right. It's simple and clear, and performance will not be an issue.
Answer by Matthew
I'll just note that I had always tended to use in-place concatenation until I reread a portion of PEP 8, the Style Guide for Python Code:
- Code should be written in a way that does not disadvantage other implementations of Python (PyPy, Jython, IronPython, Pyrex, Psyco, and such). For example, do not rely on CPython's efficient implementation of in-place string concatenation for statements in the form a+=b or a=a+b. Those statements run more slowly in Jython. In performance sensitive parts of the library, the ''.join() form should be used instead. This will ensure that concatenation occurs in linear time across various implementations.
Going by this, I have been converting to the practice of using joins so that I may retain the habit as a more automatic practice when efficiency is extra critical.
So I'll put in my vote for:
ret = '@'.join([user, host])
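To make the PEP 8 point about linear-time concatenation a little more concrete, here is a minimal sketch that times repeated += against a single ''.join() over many pieces. The workload size and the function names are assumptions picked for illustration; the actual timings depend on the machine and the Python implementation, and CPython may optimize the += case, as the quoted passage notes.

```python
import timeit

# Hypothetical workload: 10,000 short pieces (size chosen only for illustration).
PIECES = ["x"] * 10_000


def concat_in_loop():
    # Relies on the implementation handling repeated += efficiently.
    result = ""
    for piece in PIECES:
        result += piece
    return result


def join_once():
    # Builds the final string in a single join call.
    return "".join(PIECES)


# Both approaches must produce the same string.
assert concat_in_loop() == join_once()

# number=100 keeps the run short; absolute numbers will differ between systems.
print("+= in a loop:", timeit.timeit(concat_in_loop, number=100))
print("''.join():   ", timeit.timeit(join_once, number=100))
```

Running the same script under another implementation (for example PyPy or Jython) is one way to see how much the += result depends on the interpreter.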
Answer by ivanleoncz
I recommend join() over concatenation, based on two aspects:
- Faster.
- More elegant.
Regarding the first aspect, here's an example:
import timeit

s1 = "Flowers"
s2 = "of"
s3 = "War"

def join_concat():
    # Concatenate the pieces with the + operator
    return s1 + " " + s2 + " " + s3

def join_builtin():
    # Join the same pieces with str.join()
    return " ".join((s1, s2, s3))

print("Join Concatenation: ", timeit.timeit(join_concat))
print("Join Builtin: ", timeit.timeit(join_builtin))
The output:
$ python3 join_test.py
Join Concatenation: 0.40386943198973313
Join Builtin: 0.2666833929979475
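Note that timeit.timeit runs each function 1,000,000 times by default, so the difference of roughly 0.14 seconds is the total over a million calls (about 0.14 microseconds per call). When processing a huge dataset (millions of lines), that overhead adds up.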
As for the second aspect, join() is indeed more elegant.
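The figures printed by timeit.timeit are totals for the whole run, not per-call times. A small sketch of how the per-call cost could be derived is shown below; the RUNS constant and the output format are assumptions added for illustration, while the two functions mirror the ones in the answer.

```python
import timeit

s1, s2, s3 = "Flowers", "of", "War"


def join_concat():
    return s1 + " " + s2 + " " + s3


def join_builtin():
    return " ".join((s1, s2, s3))


# Explicit iteration count (timeit's default is 1,000,000); value chosen for a quick run.
RUNS = 100_000

for func in (join_concat, join_builtin):
    total = timeit.timeit(func, number=RUNS)  # total seconds for RUNS calls
    per_call_us = total / RUNS * 1e6          # convert to microseconds per call
    print(f"{func.__name__}: {total:.4f} s total, {per_call_us:.3f} us per call")
```

Dividing by the iteration count makes it easier to judge whether the difference matters for a given workload.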

