How do I append one string to another in Python?
Original source: http://stackoverflow.com/questions/4435169/
Notice: this page reproduces a popular Stack Overflow question and its answers under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL and authors, and attribute it to the original authors (not me): Stack Overflow.
Asked by user469652
I want an efficient way to append one string to another in Python, other than the following.
var1 = "foo"
var2 = "bar"
var3 = var1 + var2
Is there any good built-in method to use?
Accepted answer by John La Rooy
If you only have one reference to a string and you concatenate another string to the end, CPython now special-cases this and tries to extend the string in place.
The end result is that the operation is amortized O(n).
For example,
s = ""
for i in range(n):
    s += str(i)
used to be O(n^2), but now it is O(n).
From the source (bytesobject.c):
void
PyBytes_ConcatAndDel(register PyObject **pv, register PyObject *w)
{
    PyBytes_Concat(pv, w);
    Py_XDECREF(w);
}
/* The following function breaks the notion that strings are immutable:
   it changes the size of a string.  We get away with this only if there
   is only one module referencing the object.  You can also think of it
   as creating a new string object and destroying the old one, only
   more efficiently.  In any case, don't use this if the string may
   already be known to some other part of the code...
   Note that if there's not enough memory to resize the string, the original
   string object at *pv is deallocated, *pv is set to NULL, an "out of
   memory" exception is set, and -1 is returned.  Else (on success) 0 is
   returned, and the value in *pv may or may not be the same as on input.
   As always, an extra byte is allocated for a trailing \0 byte (newsize
   does *not* include that), and a trailing \0 byte is stored.
*/
int
_PyBytes_Resize(PyObject **pv, Py_ssize_t newsize)
{
    register PyObject *v;
    register PyBytesObject *sv;
    v = *pv;
    if (!PyBytes_Check(v) || Py_REFCNT(v) != 1 || newsize < 0) {
        *pv = 0;
        Py_DECREF(v);
        PyErr_BadInternalCall();
        return -1;
    }
    /* XXX UNREF/NEWREF interface should be more symmetrical */
    _Py_DEC_REFTOTAL;
    _Py_ForgetReference(v);
    *pv = (PyObject *)
        PyObject_REALLOC((char *)v, PyBytesObject_SIZE + newsize);
    if (*pv == NULL) {
        PyObject_Del(v);
        PyErr_NoMemory();
        return -1;
    }
    _Py_NewReference(*pv);
    sv = (PyBytesObject *) *pv;
    Py_SIZE(sv) = newsize;
    sv->ob_sval[newsize] = '\0';
    sv->ob_shash = -1;          /* invalidate cached hash value */
    return 0;
}
It's easy enough to verify empirically.
$ python -m timeit -s"s=''" "for i in xrange(10):s+='a'"
1000000 loops, best of 3: 1.85 usec per loop
$ python -m timeit -s"s=''" "for i in xrange(100):s+='a'"
10000 loops, best of 3: 16.8 usec per loop
$ python -m timeit -s"s=''" "for i in xrange(1000):s+='a'"
10000 loops, best of 3: 158 usec per loop
$ python -m timeit -s"s=''" "for i in xrange(10000):s+='a'"
1000 loops, best of 3: 1.71 msec per loop
$ python -m timeit -s"s=''" "for i in xrange(100000):s+='a'"
10 loops, best of 3: 14.6 msec per loop
$ python -m timeit -s"s=''" "for i in xrange(1000000):s+='a'"
10 loops, best of 3: 173 msec per loop
It's important, however, to note that this optimisation isn't part of the Python spec. As far as I know, it exists only in the CPython implementation. The same empirical testing on PyPy or Jython, for example, might show the older O(n**2) performance.
$ pypy -m timeit -s"s=''" "for i in xrange(10):s+='a'"
10000 loops, best of 3: 90.8 usec per loop
$ pypy -m timeit -s"s=''" "for i in xrange(100):s+='a'"
1000 loops, best of 3: 896 usec per loop
$ pypy -m timeit -s"s=''" "for i in xrange(1000):s+='a'"
100 loops, best of 3: 9.03 msec per loop
$ pypy -m timeit -s"s=''" "for i in xrange(10000):s+='a'"
10 loops, best of 3: 89.5 msec per loop
So far so good, but then,
$ pypy -m timeit -s"s=''" "for i in xrange(100000):s+='a'"
10 loops, best of 3: 12.8 sec per loop
Ouch, even worse than quadratic. So PyPy is doing something that works well with short strings but performs poorly for larger strings.
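The command-line measurements in this answer use Python 2's xrange. As a rough Python 3 sketch of the same experiment, the script below times repeated += for growing n and prints how much each step costs relative to the previous one. The helper names build_by_append and time_append are made up for this illustration, and the numbers will vary by machine and interpreter, so no expected timings are shown.

```python
import timeit


def build_by_append(n):
    """Build a string of n characters by repeated +=."""
    s = ""
    for _ in range(n):
        s += "a"
    return s


def time_append(n, repeat=3, number=5):
    """Return the best per-call time in seconds for build_by_append(n)."""
    timer = timeit.Timer(lambda: build_by_append(n))
    return min(timer.repeat(repeat=repeat, number=number)) / number


if __name__ == "__main__":
    previous = None
    for n in (1000, 10000, 100000):
        seconds = time_append(n)
        # A ratio near 10 for a 10x larger n suggests linear growth;
        # a ratio near 100 suggests quadratic growth.
        ratio = "" if previous is None else "  (x%.1f)" % (seconds / previous)
        print("n=%7d  %.6f s%s" % (n, seconds, ratio))
        previous = seconds
```

Running the same file under a different interpreter is one way to check whether the in-place optimisation applies there.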
Answer by Laurence Gonsalves
If you need to do many append operations to build a large string, you can use StringIO or cStringIO. The interface is like a file, i.e. you write to append text to it.
If you're just appending two strings, then just use +.
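StringIO and cStringIO are Python 2 module names; in Python 3 the equivalent class is io.StringIO. A minimal sketch of the file-like approach, assuming Python 3 (the function name build_with_stringio is invented for this example):

```python
import io


def build_with_stringio(parts):
    """Append each part to an in-memory text buffer, then return the result."""
    buffer = io.StringIO()
    for part in parts:
        buffer.write(part)  # write() appends at the current position
    return buffer.getvalue()


result = build_with_stringio(["foo", "bar", "baz"])
print(result)  # prints foobarbaz
```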
Answer by Rafe Kettler
str1 = "Hello"
str2 = "World"
newstr = " ".join((str1, str2))
That joins str1 and str2 with a space as the separator. You can also do "".join((str1, str2, ...)). str.join() takes an iterable, so you have to put the strings in a list or a tuple.
That's about as efficient as it gets for a built-in method.
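To illustrate the point that str.join() takes a single iterable, here is a short sketch: passing a tuple or a list works, while passing the strings as separate arguments raises TypeError.

```python
str1 = "Hello"
str2 = "World"

spaced = " ".join((str1, str2))  # tuple of strings, space as separator
glued = "".join([str1, str2])    # list of strings, empty separator

try:
    "".join(str1, str2)          # two positional arguments: not allowed
except TypeError as exc:
    error = exc
else:
    error = None

print(spaced)                # prints Hello World
print(glued)                 # prints HelloWorld
print(type(error).__name__)  # prints TypeError
```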
Answer by John Kugelman
Don't prematurely optimize. If you have no reason to believe there's a speed bottleneck caused by string concatenations, then just stick with + and +=:
s = 'foo'
s += 'bar'
s += 'baz'
That said, if you're aiming for something like Java's StringBuilder, the canonical Python idiom is to add items to a list and then use str.join to concatenate them all at the end:
l = []
l.append('foo')
l.append('bar')
l.append('baz')
s = ''.join(l)
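If a StringBuilder-like interface is wanted, the list-then-join idiom described in this answer can be wrapped in a small class. This is only a sketch: StringBuilder here is a hypothetical helper written for illustration, not something the standard library provides.

```python
class StringBuilder:
    """Hypothetical helper that collects pieces in a list and joins once."""

    def __init__(self):
        self._parts = []

    def append(self, text):
        self._parts.append(str(text))
        return self  # returning self allows chained calls

    def build(self):
        return "".join(self._parts)


builder = StringBuilder()
builder.append("foo").append("bar").append("baz")
s = builder.build()
print(s)  # prints foobarbaz
```

For a handful of pieces a plain list and ''.join are usually enough; the class only adds a familiar interface.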
Answer by Winston Ewert
Don't.
That is, for most cases you are better off generating the whole string in one go rather than appending to an existing string.
For example, don't do: obj1.name + ":" + str(obj1.count)
Instead, use: "%s:%d" % (obj1.name, obj1.count)
That will be easier to read and more efficient.
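A small sketch of the same idea, with obj1 replaced by a stand-in object whose name and count values are invented for the example. It also shows str.format and an f-string (Python 3.6+), which are not part of the answer but produce the same result in one go.

```python
from types import SimpleNamespace

# Hypothetical stand-in for the obj1 mentioned in the answer.
obj1 = SimpleNamespace(name="widget", count=3)

piecewise = obj1.name + ":" + str(obj1.count)      # built up with +
percent = "%s:%d" % (obj1.name, obj1.count)        # the form the answer suggests
formatted = "{}:{}".format(obj1.name, obj1.count)  # str.format alternative
fstring = f"{obj1.name}:{obj1.count}"              # f-string alternative

print(percent)  # prints widget:3
```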
Answer by Ramy
It really depends on your application. If you're looping through hundreds of words and want to append them all into a list, .join() is better. But if you're putting together a long sentence, you're better off using +=.
Answer by ostrokach
Basically, no difference. The only consistent trend is that Python seems to be getting slower with every version... :(
List
%%timeit
x = []
for i in range(100000000):  # xrange on Python 2.7
    x.append('a')
x = ''.join(x)
Python 2.7
1 loop, best of 3: 7.34 s per loop
Python 3.4
1 loop, best of 3: 7.99 s per loop
Python 3.5
1 loop, best of 3: 8.48 s per loop
Python 3.6
1 loop, best of 3: 9.93 s per loop
String
%%timeit
x = ''
for i in range(100000000):  # xrange on Python 2.7
    x += 'a'
Python 2.7
1 loop, best of 3: 7.41 s per loop
Python 3.4
1 loop, best of 3: 9.08 s per loop
Python 3.5
1 loop, best of 3: 8.82 s per loop
Python 3.6
1 loop, best of 3: 9.24 s per loop
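The %%timeit cell magic used in this comparison only works inside IPython or Jupyter. As a sketch of the same list-versus-string comparison that runs as a plain script, the standard timeit module can be used instead. The function names and the much smaller N are choices made for this example, so the results are not comparable with the figures above.

```python
import timeit

N = 100_000  # far smaller than the 100000000 iterations used in the answer


def build_with_list(n=N):
    """Collect the pieces in a list and join once at the end."""
    x = []
    for _ in range(n):
        x.append("a")
    return "".join(x)


def build_with_string(n=N):
    """Grow the string directly with +=."""
    x = ""
    for _ in range(n):
        x += "a"
    return x


if __name__ == "__main__":
    for func in (build_with_list, build_with_string):
        best = min(timeit.repeat(func, repeat=3, number=10))
        print("%-18s %.4f s per 10 calls" % (func.__name__, best))
```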
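Answer by Rahul Shrivastava
a = 'foo'
b = 'baaz'
a.__add__(b)
out: 'foobaaz'
Answer by Sai Gopi N
Append strings with the __add__ function:
str1 = "Hello"
str2 = " World"
st = str1.__add__(str2)
print(st)
Output
Hello World
On Python 3.6 and later, an f-string gives the same result:
var1 = "foo"
var2 = "bar"
var3 = f"{var1}{var2}"
print(var3)  # prints foobar
print(f"1 + 1 == {1 + 1}")  # prints 1 + 1 == 2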
