pandas DataFrame: add and remove a prefix/suffix from all cell values of an entire DataFrame

Question

Asked by murphy1310

To add a prefix or suffix to every value in a DataFrame, I usually do the following.

For instance, to add a suffix '@',

df = df.astype(str) + '@'

This appends '@' to every cell value.

I would like to know how to remove this suffix. Does the pandas.DataFrame class have a method that removes a particular prefix/suffix character from the entire DataFrame directly?

I've tried iterating through the rows (as Series) while using rstrip('@') as follows:

for index in range(df.shape[0]):
    row = df.iloc[index]
    row = row.str.rstrip('@')

Now, to build a DataFrame out of this Series:

new_df = pd.DataFrame(columns=list(df))
new_df = new_df.append(row)

However, this doesn't work; it gives an empty DataFrame.

Is there something really basic that I am missing?

Answer 1

Accepted answer by AlexG

You could use applymap to apply your string method to each element:

df = df.applymap(lambda x: str(x).rstrip('@'))

Note: I wouldn't expect this to be as fast as the vectorized approach, pd.Series.str.rstrip, i.e. transforming each column separately.
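
As a rough sketch of the two options mentioned here, the snippet below runs the element-wise call and the per-column .str call on a small made-up frame (the data is an assumption, not from the question). In pandas 2.1 and later, applymap is deprecated in favor of DataFrame.map, so the sketch picks whichever the installed version provides.

```python
import pandas as pd

# Made-up sample data for illustration only.
df = pd.DataFrame({"a": ["dog@", "lazy@"], "b": ["quick@", "the@"]})

# Element-wise: one Python-level call per cell.
# DataFrame.map is the pandas >= 2.1 name for applymap.
elementwise = df.map if hasattr(pd.DataFrame, "map") else df.applymap
per_cell = elementwise(lambda x: str(x).rstrip("@"))

# Column-wise: one vectorized .str call per column.
per_column = df.apply(lambda col: col.str.rstrip("@"))

print(per_cell)
print(per_column)
```

Both calls return a new DataFrame and leave df unchanged, so the result has to be assigned back if it should replace the original.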

Answer 2

Answered by juanpa.arrivillaga

You can use apply and the str.strip method of pd.Series:

In [13]: df
Out[13]:
       a       b      c
0    dog   quick    the
1   lazy    lazy    fox
2  brown   quick    dog
3  quick     the   over
4  brown    over   lazy
5    fox   brown  quick
6  quick     fox    the
7    dog  jumped    the
8   lazy   brown    the
9    dog    lazy    the

In [14]: df = df + "@"

In [15]: df
Out[15]:
        a        b       c
0    dog@   quick@    the@
1   lazy@    lazy@    fox@
2  brown@   quick@    dog@
3  quick@     the@   over@
4  brown@    over@   lazy@
5    fox@   brown@  quick@
6  quick@     fox@    the@
7    dog@  jumped@    the@
8   lazy@   brown@    the@
9    dog@    lazy@    the@

In [16]: df = df.apply(lambda S:S.str.strip('@'))

In [17]: df
Out[17]:
       a       b      c
0    dog   quick    the
1   lazy    lazy    fox
2  brown   quick    dog
3  quick     the   over
4  brown    over   lazy
5    fox   brown  quick
6  quick     fox    the
7    dog  jumped    the
8   lazy   brown    the
9    dog    lazy    the
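
One caveat about the session above: str.strip('@') removes every leading and trailing '@', not just a single suffix. If exactly one trailing '@' should go, Series.str.removesuffix (available in pandas 1.4 and later) is one possible alternative. The sample values below are made up to show the difference.

```python
import pandas as pd

# Made-up values with '@' in several positions, for illustration only.
df = pd.DataFrame({"a": ["@dog@", "lazy@@"], "b": ["quick@", "the"]})

# strip: removes all '@' characters from both ends of each value.
stripped = df.apply(lambda col: col.str.strip("@"))

# removesuffix: removes at most one trailing '@' and never touches the start.
suffix_only = df.apply(lambda col: col.str.removesuffix("@"))

print(stripped)
print(suffix_only)
```

Series.str.removeprefix works the same way for a leading prefix.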

Note, your approach doesn't work because when you do the following assignment in your for-loop:

row = row.str.rstrip('@')

This merely assigns the result of row.str.rstrip to the name row without mutating the DataFrame. All Python objects behave this way under simple name assignment:

In [18]: rows = [[1,2,3],[4,5,6],[7,8,9]]

In [19]: print(rows)
[[1, 2, 3], [4, 5, 6], [7, 8, 9]]

In [20]: for row in rows:
    ...:     row = ['look','at','me']
    ...:

In [21]: print(rows)
[[1, 2, 3], [4, 5, 6], [7, 8, 9]]

To actually change the underlying data structure you need to use a mutator method:

In [22]: rows
Out[22]: [[1, 2, 3], [4, 5, 6], [7, 8, 9]]

In [23]: for row in rows:
    ...:     row.append("LOOKATME")
    ...:

In [24]: rows
Out[24]: [[1, 2, 3, 'LOOKATME'], [4, 5, 6, 'LOOKATME'], [7, 8, 9, 'LOOKATME']]

Note that slice-assignment is just syntactic sugar for a mutator method:

In [26]: rows
Out[26]: [[1, 2, 3], [4, 5, 6], [7, 8, 9]]

In [27]: for row in rows:
    ...:     row[:] = ['look','at','me']
    ...:
    ...:

In [28]: rows
Out[28]: [['look', 'at', 'me'], ['look', 'at', 'me'], ['look', 'at', 'me']]

This is analogous to pandas loc- or iloc-based assignment.
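
To make the analogy concrete, here is a minimal sketch of the question's row loop rewritten with iloc-based assignment, so that each stripped row is written back into the DataFrame. The sample data is made up, and the column-wise apply shown earlier is still the more idiomatic route; this only illustrates the mutation point.

```python
import pandas as pd

# Made-up sample data for illustration only.
df = pd.DataFrame({"a": ["dog@", "lazy@"], "b": ["quick@", "the@"]})

for index in range(df.shape[0]):
    # Assigning through iloc updates the DataFrame itself,
    # whereas `row = ...` would only rebind a local name.
    df.iloc[index] = df.iloc[index].str.rstrip("@")

print(df)
```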

Answer 3

Answered by SummerEla

You could make this simpler by using the pandas.DataFrame.replace() method to replace every "@" with "". Pass regex=True so that "@" is matched inside each value; without it, replace only changes cells whose entire value is "@":

df = df.replace("@", "", regex=True)

If you only want to remove "@" at the end of each value, anchor the pattern with $:

df = df.replace("@$", "", regex=True)
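
The difference between the replace variants can be seen on a small made-up column (the values are assumptions chosen to hit each case). The frame is built with dtype=object, the classic string representation in pandas, to keep the sketch version-neutral.

```python
import pandas as pd

# Made-up values: a suffix, a bare "@", and an "@" in the middle.
df = pd.DataFrame({"a": ["dog@", "@", "a@b@"]}, dtype=object)

# Without regex=True, only cells that are exactly "@" are replaced.
exact = df.replace("@", "")

# With regex=True, every "@" inside a value is removed.
anywhere = df.replace("@", "", regex=True)

# Anchored with $, only a single trailing "@" is removed.
suffix = df.replace("@$", "", regex=True)

print(exact["a"].tolist())     # ['dog@', '', 'a@b@']
print(anywhere["a"].tolist())  # ['dog', '', 'ab']
print(suffix["a"].tolist())    # ['dog', '', 'a@b']
```

For a prefix, the same idea applies with the pattern anchored at the start, for example "^@".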
