Pandas: Use iterrows on Dataframe subset

Note: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original address, and attribute it to the original authors (not me). Original StackOverflow question: http://stackoverflow.com/questions/19666218/

Date: 2020-09-13 21:17:37  Source: igfitidea

Tags: python, loops, pandas, subset

Asked by Andy

What is the best way to do iterrows with a subset of a DataFrame?

Let's take the following simple example:

import datetime as DT

import pandas as pd

df = pd.DataFrame({
    'Product': list('AAAABBAA'),
    'Quantity': [5, 2, 5, 10, 1, 5, 2, 3],
    'Start': [
        DT.datetime(2013, 1, 1, 9, 0),
        DT.datetime(2013, 1, 1, 8, 5),
        DT.datetime(2013, 2, 5, 14, 0),
        DT.datetime(2013, 2, 5, 16, 0),
        DT.datetime(2013, 2, 8, 20, 0),
        DT.datetime(2013, 2, 8, 16, 50),
        DT.datetime(2013, 2, 8, 7, 0),
        DT.datetime(2013, 7, 4, 8, 0)]})

# Use the start timestamp as the index
df = df.set_index(['Start'])

Now I would like to modify a subset of this DataFrame using the iterrows function, e.g.:

for i, row_i in df[df.Product == 'A'].iterrows():
    row_i['Product'] = 'A1' # actually a more complex calculation

However, the changes do not persist.
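
A minimal sketch of why the loop has no effect, using a shortened, hypothetical version of the DataFrame above: boolean indexing returns a new DataFrame, and iterrows builds a fresh Series for every row, so the assignment only touches a temporary object.

```python
import datetime as DT

import pandas as pd

# Shortened stand-in for the DataFrame in the question (assumed data)
df = pd.DataFrame(
    {"Product": list("AAB"), "Quantity": [5, 2, 1]},
    index=pd.Index(
        [
            DT.datetime(2013, 1, 1, 9, 0),
            DT.datetime(2013, 1, 1, 8, 5),
            DT.datetime(2013, 2, 8, 20, 0),
        ],
        name="Start",
    ),
)

for i, row_i in df[df.Product == "A"].iterrows():
    # row_i is a Series built for this iteration; df is not linked to it
    row_i["Product"] = "A1"

# The original DataFrame still holds the old labels
products_after_loop = df["Product"].tolist()
print(products_after_loop)
```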

Is there any way (other than a manual lookup using the index 'i') to make persistent changes to the original DataFrame?

Answered by Roman Pekar

Why do you need iterrows() for this? I think it's always preferable to use vectorized operations in pandas (or numpy):

# Select the rows where Product is 'A' and overwrite that column in place
df.loc[df['Product'] == 'A', 'Product'] = 'A1'
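
The same masking idea may extend to a calculation that depends on other columns. The sketch below is only an illustration: the doubling of Quantity is a hypothetical stand-in for the "more complex calculation" from the question, and the DataFrame omits the Start index for brevity.

```python
import pandas as pd

# Simplified version of the question's data (Start index left out)
df = pd.DataFrame({
    "Product": list("AAAABBAA"),
    "Quantity": [5, 2, 5, 10, 1, 5, 2, 3],
})

# Build the boolean mask once, before any column is changed
mask = df["Product"] == "A"

# Hypothetical calculation: double the quantity for the selected rows only
df.loc[mask, "Quantity"] = df.loc[mask, "Quantity"] * 2

# Relabel the same rows; both assignments write to the original df
df.loc[mask, "Product"] = "A1"

print(df)
```

Computing the mask first matters here, because relabeling Product before the Quantity update would make `df["Product"] == "A"` match nothing.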

Answered by Magellan88

The best way that comes to mind is to build a new vector holding the desired result (you can loop over it as much as you want) and then assign it back to the column:

# Make a copy of the column
P = df.Product.copy()
# Do the operation, or loop if you really must
P[P == "A"] = "A1"
# Reassign to the original df
df["Product"] = P
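
If a row-by-row loop really is needed, one possible variant of this approach is to collect the results in a list and write them back in a single assignment. This is a sketch under assumptions: `relabel` is a hypothetical helper standing in for the real calculation, and the DataFrame omits the Start index.

```python
import pandas as pd

# Simplified version of the question's data (Start index left out)
df = pd.DataFrame({
    "Product": list("AAAABBAA"),
    "Quantity": [5, 2, 5, 10, 1, 5, 2, 3],
})


def relabel(row):
    # Hypothetical per-row calculation: append the quantity to the label
    return f"{row['Product']}{int(row['Quantity'])}"


mask = df["Product"] == "A"

# Loop over the subset only to read values; nothing is written inside the loop
new_values = [relabel(row) for _, row in df[mask].iterrows()]

# One assignment back to the original DataFrame, aligned with the mask
df.loc[mask, "Product"] = new_values

print(df["Product"].tolist())
```

The list has exactly one entry per selected row, in row order, so it lines up with the rows picked out by `mask`.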