pandas: change a Series in place in a DataFrame after applying a function to it

Note: this page is a translated copy of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). Original StackOverflow URL: http://stackoverflow.com/questions/30276745/

Date: 2020-09-13 23:21:37  Source: igfitidea

Change Series inplace in DataFrame after applying function on it

Tags: python, pandas

Asked by Infinity

I'm trying to use pandas to change one of my columns in place, using a simple function.

After reading the whole DataFrame, I tried to apply a function to one Series:

wanted_data.age.apply(lambda x: x+1)

And it's working great. The only problem occurs when I try to put it back into my DataFrame:


wanted_data.age = wanted_data.age.apply(lambda x: x+1)

or:


wanted_data['age'] = wanted_data.age.apply(lambda x: x+1)

Throwing the following warning:


> C:\Anaconda\lib\site-packages\pandas\core\generic.py:1974: SettingWithCopyWarning:
> A value is trying to be set on a copy of a slice from a DataFrame.
> Try using .loc[row_indexer,col_indexer] = value instead
> 
> See the the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy
>   self[name] = value

Of course, I can set the column using the longer form:

wanted_data.loc[:, 'age'] = wanted_data.age.apply(lambda x: x+1)

But is there no other, simpler and syntactically nicer way to do it?

Thanks!


Accepted answer by Alexander

Use loc:


wanted_data.loc[:, 'age'] = wanted_data.age.apply(lambda x: x + 1)
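
As a minimal sketch of this answer, the snippet below builds a small stand-in for wanted_data (the names and ages are made up for illustration, since the question does not show the real data) and assigns through a single .loc call. For plain arithmetic like x + 1, a vectorized update is a possible alternative to apply(); that alternative is a suggestion here, not part of the original answer.

```python
import pandas as pd

# Hypothetical stand-in for the asker's data (not from the question)
wanted_data = pd.DataFrame({"name": ["Ann", "Bob", "Cy"], "age": [30, 41, 25]})

# Assign through a single .loc call, as the answer suggests
wanted_data.loc[:, "age"] = wanted_data["age"].apply(lambda x: x + 1)

# For simple arithmetic, a vectorized update avoids apply() altogether
wanted_data["age"] += 1

print(wanted_data["age"].tolist())  # [32, 43, 27]
```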

Answer by Irfanullah

I would suggest wanted_data['age'] = wanted_data['age'].apply(lambda x: x+1), then saving the file with wanted_data.to_csv(fname, index=False), where "fname" is the name of the file to be updated.
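
The round trip described here might look roughly like the sketch below. To keep it self-contained, in-memory io.StringIO buffers stand in for the real input file and for fname, and the CSV contents are invented for illustration.

```python
import io

import pandas as pd

# Invented CSV content standing in for the file the asker reads
source = io.StringIO("name,age\nAnn,30\nBob,41\n")
wanted_data = pd.read_csv(source)

# Update the column by assigning the result of apply() back to it
wanted_data["age"] = wanted_data["age"].apply(lambda x: x + 1)

# Write the result out; the buffer plays the role of the file named fname
buffer = io.StringIO()
wanted_data.to_csv(buffer, index=False)
csv_lines = buffer.getvalue().splitlines()

print(csv_lines)  # ['name,age', 'Ann,31', 'Bob,42']
```

With a real file, passing a path string as fname to to_csv works the same way.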

Answer by Thanasis Mattas

I cannot comment, so I'll leave this as an answer.


Because of the way chained indexing is handled internally, you may get back a copy instead of a reference to your initial DataFrame (for more, see the pandas documentation on chained assignment, which is a very good source; assigning through a single .loc[] call always operates on the object it is called on). Thus, you may be assigning not to your original DataFrame but to a copy of it. On the other hand, your form may return a reference to the initial DataFrame, in which case mutating it mutates the initial DataFrame too. pandas (not Python itself) emits this warning to draw attention to the situation, so that the user can decide whether this is the intended behavior.
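
One common way to remove the ambiguity is to take an explicit copy when the subset is created. The sketch below assumes that wanted_data was obtained by filtering a larger DataFrame, which the question does not actually show; the frame people and its contents are hypothetical.

```python
import pandas as pd

# Hypothetical larger frame that wanted_data is derived from
people = pd.DataFrame({"name": ["Kid", "Ann", "Bob"], "age": [15, 30, 41]})

# .copy() makes wanted_data an independent DataFrame, so the later
# assignment clearly targets wanted_data and nothing else
wanted_data = people[people["age"] >= 18].copy()
wanted_data["age"] = wanted_data["age"].apply(lambda x: x + 1)

print(wanted_data["age"].tolist())  # [31, 42]
print(people["age"].tolist())       # [15, 30, 41]
```

The copy costs some memory, but it makes the intent explicit: only wanted_data changes, and people is left untouched.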

If you know what you're doing, you can silence the warning using:


import pandas as pd

# Disable the chained-assignment warning inside this block only
with pd.option_context("mode.chained_assignment", None):
    wanted_data.age = wanted_data.age.apply(lambda x: x+1)

If you think this is an important matter (e.g. there is a possibility of unintentionally mutating the initial DataFrame), you can set the above option to "raise", so that an error is raised instead of a warning.

Also, I think the use of the term "inplace" is not fully correct. "inplace" is an argument of some methods that mutates an object without having to assign the result back to it (the assignment is handled internally), and apply() does not support this feature.
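
To illustrate the distinction, here is a small sketch with made-up data. rename() is used only as one example of a method that accepts inplace=True; apply() has no such argument and returns a new Series that must be assigned back.

```python
import pandas as pd

df = pd.DataFrame({"age": [30, 41]})

# A method with an inplace argument mutates df without reassignment
df.rename(columns={"age": "years"}, inplace=True)
print(list(df.columns))  # ['years']

# apply() returns a new Series; df is unchanged until the result is assigned
incremented = df["years"].apply(lambda x: x + 1)
print(df["years"].tolist())   # [30, 41]
print(incremented.tolist())   # [31, 42]

# Assigning the result back is what actually updates the column
df["years"] = incremented
print(df["years"].tolist())   # [31, 42]
```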