Python: add a string prefix to each value in a string column using Pandas

Question

Asked by TheChymera

I would like to prepend a string to each value in a given column of a pandas DataFrame (elegantly). I already figured out how to kind-of do this and I am currently using:

df.loc[df['col'] != False, 'col'] = 'str' + df.loc[df['col'] != False, 'col']

This seems one hell of an inelegant thing to do - do you know any other way (which maybe also adds the prefix to rows where that column is 0 or NaN)?

In case this is still unclear, I would like to turn:

    col 
1     a
2     0

into:

       col 
1     stra
2     str0

Answer 1

Accepted answer by Roman Pekar

df['col'] = 'str' + df['col'].astype(str)

Example:

>>> import pandas as pd
>>> df = pd.DataFrame({'col':['a',0]})
>>> df
  col
0   a
1   0
>>> df['col'] = 'str' + df['col'].astype(str)
>>> df
    col
0  stra
1  str0
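The question also asks about NaN. The following is a minimal sketch, not part of the accepted answer, of one way to prefix every non-missing value while leaving missing ones as NaN. The three-row frame and the name `prefixed` are made up for illustration. Note that on many pandas versions `astype(str)` turns NaN into the literal string 'nan' (which would then become 'strnan'); the `where` call restores the missing value in either case.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: a string, a number and a missing value
df = pd.DataFrame({'col': ['a', 0, np.nan]})

# Prefix everything, then blank out the rows that were missing to begin with
prefixed = ('str' + df['col'].astype(str)).where(df['col'].notna())

print(prefixed)
```

If the missing rows should receive the prefix as well, dropping the `.where(...)` call is enough, although the text that ends up after the prefix for those rows depends on the pandas version.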

Answer 2

Answer by Cleb

As an alternative, you can also use apply combined with format (or better, with f-strings), which I find slightly more readable if one e.g. also wants to add a suffix or manipulate the element itself:

df = pd.DataFrame({'col':['a', 0]})

df['col'] = df['col'].apply(lambda x: "{}{}".format('str', x))

which also yields the desired output:

    col
0  stra
1  str0

If you are using Python 3.6+, you can also use f-strings:

df['col'] = df['col'].apply(lambda x: f"str{x}")

yielding the same output.

The f-string version is almost as fast as @RomanPekar's solution (Python 3.6.4):

df = pd.DataFrame({'col':['a', 0]*200000})

%timeit df['col'].apply(lambda x: f"str{x}")
117 ms ± 451 μs per loop (mean ± std. dev. of 7 runs, 10 loops each)

%timeit 'str' + df['col'].astype(str)
112 ms ± 1.04 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

Using format, however, is noticeably slower (roughly 1.6 times as long in this run):

%timeit df['col'].apply(lambda x: "{}{}".format('str', x))
185 ms ± 1.07 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
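The `%timeit` lines above only work inside IPython or Jupyter. As a rough sketch of how the same comparison could be reproduced in a plain script, the standard-library `timeit` module can be used instead. The smaller frame size, the `candidates` dictionary and the repeat counts are arbitrary choices for illustration, and the numbers will differ from those quoted above depending on machine and versions.

```python
import timeit

import pandas as pd

# Smaller than the frame used above so the script finishes quickly
df = pd.DataFrame({'col': ['a', 0] * 50_000})

# The three approaches compared in this answer
candidates = {
    'concat': lambda: 'str' + df['col'].astype(str),
    'f-string': lambda: df['col'].apply(lambda x: f"str{x}"),
    'format': lambda: df['col'].apply(lambda x: "{}{}".format('str', x)),
}

# Sanity check: all three should produce the same values
results = {name: fn().tolist() for name, fn in candidates.items()}

for name, fn in candidates.items():
    # Best of 3 repeats, each repeat running the function 3 times
    best = min(timeit.repeat(fn, number=3, repeat=3)) / 3
    print(f"{name:>8}: {best * 1000:.1f} ms per call")
```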

Answer 3

Answer by Vasyl Vaskivskyi

If you load your table file with dtype=str, or convert the column to string with df['a'] = df['a'].astype(str), then you can use this approach:

df['a'] = 'col' + df['a'].str[:]

This approach lets you prepend to, append to, and take substrings of the strings in the column.
It works on Pandas v0.23.4 and v0.24.1. I don't know about earlier versions.
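To make the three operations concrete, here is a small sketch using the `.str` accessor. The sample values and the `_x` suffix are invented for illustration. Since `.str[:]` is a full slice, it returns each string unchanged; a narrower slice such as `.str[:3]` is what actually takes a substring.

```python
import pandas as pd

# Hypothetical data; the column must hold strings for .str slicing to work
df = pd.DataFrame({'a': ['apple', 10, 'cherry']})
df['a'] = df['a'].astype(str)

prepended = 'col' + df['a'].str[:]          # full slice: whole string kept
appended = df['a'].str[:] + '_x'            # suffix instead of prefix
first_three = df['a'].str[:3]               # substring: first three characters
combined = 'col' + df['a'].str[:3] + '_x'   # all three at once

print(combined)
```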

Answer 4

Answer by Lukas

Another solution with .loc:

df = pd.DataFrame({'col': ['a', 0]})
df.loc[df.index, 'col'] = 'string' + df['col'].astype(str)

This is not as quick as the solutions above (more than 1 ms per loop slower), but it may be useful if you need a conditional change, such as:

mask = (df['col'] == 0)
df.loc[mask, 'col'] = 'string' + df['col'].astype(str)
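As a self-contained sketch of the conditional case, the snippet below starts from a fresh frame (the four sample rows are made up) and prefixes only the rows where the value is 0. Here the right-hand side is restricted to the masked rows as well, which is a small variation on the lines above rather than the answer's exact code.

```python
import pandas as pd

# Hypothetical data with two rows that match the condition
df = pd.DataFrame({'col': ['a', 0, 'b', 0]})

# Select only the rows whose value equals 0
mask = df['col'] == 0

# Prefix just those rows; all other rows are left as they were
df.loc[mask, 'col'] = 'string' + df.loc[mask, 'col'].astype(str)

print(df)
```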

Answer 5

Answer by Boxtell

You can use pandas.Series.map:

df['col'] = df['col'].map('str{}'.format)

It will add the word "str" before all your values.
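A short sketch of how `map` treats missing values, which the question asks about. The sample Series is invented for illustration. By default the formatter is also called on NaN, while the `na_action='ignore'` argument of `Series.map` passes missing values through untouched.

```python
import numpy as np
import pandas as pd

# Hypothetical data: a string, a number and a missing value
s = pd.Series(['a', 0, np.nan])

# Default: every element, including NaN, goes through the formatter
prefixed = s.map('str{}'.format)

# na_action='ignore': NaN is propagated without calling the formatter
skip_missing = s.map('str{}'.format, na_action='ignore')

print(prefixed)
print(skip_missing)
```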
