DataFrame.apply in Python pandas alters both the original and the duplicate DataFrame

Question

Asked by MikeGruz

I'm having a bit of trouble altering a duplicated pandas DataFrame without the edits applying to both the duplicate and the original DataFrame.

Here's an example. Say I create an arbitrary DataFrame from a list of dictionaries:

In [66]: from pandas import DataFrame

In [67]: d = [{'a':3, 'b':5}, {'a':1, 'b':1}]

In [68]: d = DataFrame(d)

In [69]: d

Out[69]: 
   a  b
0  3  5
1  1  1

Then I assign the 'd' dataframe to variable 'e' and apply some arbitrary math to column 'a' using apply:

In [70]: e = d

In [71]: e['a'] = e['a'].apply(lambda x: x + 1)

The problem arises in that the apply function apparently applies to both the duplicate DataFrame 'e' and original DataFrame 'd', which I cannot for the life of me figure out:

In [72]: e # duplicate DataFrame
Out[72]: 
   a  b
0  4  5
1  2  1

In [73]: d # original DataFrame, notice the alterations to frame 'e' were also applied
Out[73]:  
   a  b
0  4  5
1  2  1

I've searched both the pandas documentation and Google for a reason why this would be so, but to no avail. I can't understand what is going on here at all.

I've also tried the math using an element-wise operation (e.g., e['a'] = [i + 1 for i in e['a']]), but the problem persists. Is there a quirk in the pandas DataFrame type that I'm not aware of? I appreciate any insight someone might be able to offer.

Answer 1

Answered by BrenBarn

This is not a pandas-specific issue. In Python, assignment never copies anything:

>>> a = [1,2,3]
>>> b = a
>>> b[0] = 'WHOA!'
>>> a
['WHOA!', 2, 3]

If you want a new DataFrame, make a copy with e = d.copy().
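
As a minimal sketch of the difference, reusing the data from the question (the names alias, same_object, original_a and copied_a are only for illustration), plain assignment gives a second name for the same object, while copy() gives an independent DataFrame:

```python
import pandas as pd

d = pd.DataFrame([{'a': 3, 'b': 5}, {'a': 1, 'b': 1}])

alias = d      # plain assignment: a second name bound to the same object
e = d.copy()   # an independent DataFrame with its own data

# Modify only the copy, as in the question.
e['a'] = e['a'].apply(lambda x: x + 1)

same_object = alias is d        # True: no copy was made by "alias = d"
original_a = d['a'].tolist()    # unchanged values in the original
copied_a = e['a'].tolist()      # incremented values in the copy

print(same_object)   # True
print(original_a)    # [3, 1]
print(copied_a)      # [4, 2]
```

Checking with `is` is a quick way to tell whether two names refer to the same DataFrame.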

Edit: I should clarify that assignment to a bare name never copies anything. Assignment to an item or attribute (e.g., a[1] = x or a.foo = bar) is converted into method calls under the hood and may do copying depending on what kind of object a is.
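
To make the distinction concrete, here is a rough sketch using a hypothetical Recorder class (not part of pandas or the standard library) that only logs which special methods get called. Bare-name assignment triggers nothing, while item and attribute assignment go through __setitem__ and __setattr__, which are free to copy or not:

```python
class Recorder:
    """Hypothetical class that records item and attribute assignments."""

    def __init__(self):
        # Bypass our own __setattr__ so the log itself can be created.
        object.__setattr__(self, 'calls', [])

    def __setitem__(self, key, value):
        self.calls.append(('__setitem__', key, value))

    def __setattr__(self, name, value):
        self.calls.append(('__setattr__', name, value))


a = Recorder()
b = a           # bare-name assignment: no method runs, nothing is copied
a[1] = 'x'      # dispatched to Recorder.__setitem__(a, 1, 'x')
a.foo = 'bar'   # dispatched to Recorder.__setattr__(a, 'foo', 'bar')

print(b is a)   # True
print(a.calls)  # [('__setitem__', 1, 'x'), ('__setattr__', 'foo', 'bar')]
```

What happens inside those methods is up to the class, which is why e['a'] = ... on a DataFrame may or may not copy data internally, whereas e = d never does.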
