How do I set the values of multiple columns at once for selected rows of a Pandas DataFrame?

Question

Asked by bigbug

I have a dataframe df which has 'TPrice', 'THigh', 'TLow', 'TOpen', 'TClose', and 'TPCLOSE' columns. For the rows whose TPrice value is zero, I want to set the 'TPrice', 'THigh', 'TLow', 'TOpen', and 'TClose' columns to the value of the 'TPCLOSE' column.

Show some rows whose TPrice is 0:

>>> df[df['TPrice']==0][['TPrice','THigh','TLow','TOpen','TClose','TPCLOSE']][0:5]
    TPrice  THigh  TLow  TOpen  TClose  TPCLOSE
13       0      0     0      0       0     4.19
19       0      0     0      0       0     7.74
32       0      0     0      0       0     3.27
43       0      0     0      0       0    12.98
60       0      0     0      0       0     7.48

Then I tried this assignment:

>>> df[df['TPrice']==0][['TPrice','THigh','TLow','TOpen','TClose']] = df['TPCLOSE']

But Pandas doesn't actually change df, because the code below still finds the same rows:

>>> df[df['TPrice']==0][['TPrice','THigh','TLow','TOpen','TClose','TPCLOSE']][0:5]
    TPrice  THigh  TLow  TOpen  TClose  TPCLOSE
13       0      0     0      0       0     4.19
19       0      0     0      0       0     7.74
32       0      0     0      0       0     3.27
43       0      0     0      0       0    12.98
60       0      0     0      0       0     7.48

So how can I do this?

Update after trying Jeff's solution:

>>> quote_df = get_quote()
>>> quote_df[quote_df['TPrice']==0][['TPrice','THigh','TLow','TOpen','TClose','TPCLOSE','RT','TVol']][0:5]
    TPrice  THigh  TLow  TOpen  TClose  TPCLOSE   RT  TVol
13       0      0     0      0       0     4.19 -100     0
32       0      0     0      0       0     3.27 -100     0
43       0      0     0      0       0    12.98 -100     0
45       0      0     0      0       0    26.74 -100     0
60       0      0     0      0       0     7.48 -100     0
>>> row_selection = quote_df['TPrice']==0
>>> col_selection = ['THigh','TLow','TOpen','TClose']
>>> for col in col_selection:
...     quote_df.loc[row_selection, col] = quote_df['TPCLOSE']
... 
>>> quote_df[quote_df['TPrice']==0][['TPrice','THigh','TLow','TOpen','TClose','TPCLOSE','RT','TVol']][0:5]
    TPrice  THigh  TLow  TOpen  TClose  TPCLOSE   RT  TVol
13       0   4.19  4.19   4.19    4.19     4.19 -100     0
32       0   4.19  4.19   4.19    4.19     3.27 -100     0
43       0   4.19  4.19   4.19    4.19    12.98 -100     0
45       0   4.19  4.19   4.19    4.19    26.74 -100     0
60       0   4.19  4.19   4.19    4.19     7.48 -100     0
>>>

Answer 1

Answered by Jeff

This operation is not broadcast automatically, so you need to do something like this:

In [16]: from pandas import DataFrame

In [17]: df = DataFrame(dict(A = [1,2,0,0,0],B=[0,0,0,10,11],C=[3,4,5,6,7]))

In [18]: df
Out[18]: 
   A   B  C
0  1   0  3
1  2   0  4
2  0   0  5
3  0  10  6
4  0  11  7

If you are modifying A (as you are here), compute the mask of rows you want to change first; otherwise the selected rows might change as you go.

In [19]: mask = df['A'] == 0

In [20]: for col in ['A','B']:
   ....:     df.loc[mask,col] = df['C']
   ....:     

In [21]: df
Out[21]: 
   A  B  C
0  1  0  3
1  2  0  4
2  5  5  5
3  6  6  6
4  7  7  7
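
To illustrate why the mask is computed before the loop, here is a minimal sketch that compares recomputing the condition on every iteration with computing it once. It reuses the example frame above; the helper `make_frame` and the names `stale` and `fixed` are illustrative only.

```python
import pandas as pd


def make_frame():
    # Same example data as in the answer above
    return pd.DataFrame({"A": [1, 2, 0, 0, 0],
                         "B": [0, 0, 0, 10, 11],
                         "C": [3, 4, 5, 6, 7]})


# Condition recomputed inside the loop: once A has been overwritten,
# no row satisfies A == 0 any more, so B is never updated.
stale = make_frame()
for col in ["A", "B"]:
    stale.loc[stale["A"] == 0, col] = stale["C"]

# Condition computed once, before any column is modified.
fixed = make_frame()
mask = fixed["A"] == 0
for col in ["A", "B"]:
    fixed.loc[mask, col] = fixed["C"]

print(stale)
print(fixed)
```

In the first frame only A picks up the values from C, because the second pass selects no rows; in the second frame both A and B are updated.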

Making this more natural requires a change in pandas: you are assigning a Series on the right-hand side to a DataFrame on the left-hand side, which right now doesn't broadcast the way you would expect. See https://github.com/pydata/pandas/issues/5206
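
A loop-free variant is possible by doing the broadcasting in NumPy instead of relying on pandas. This is only a sketch and not part of the answer itself: it reuses the example frame, assumes the target columns share a numeric dtype, and the names `cols` and `source` are illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"A": [1, 2, 0, 0, 0],
                   "B": [0, 0, 0, 10, 11],
                   "C": [3, 4, 5, 6, 7]})
cols = ["A", "B"]

# Compute the row mask before any column is modified
mask = (df["A"] == 0).to_numpy()

# Reshape mask and source to column vectors of shape (n, 1) so that
# np.where broadcasts them across every target column
source = df["C"].to_numpy()[:, None]
df[cols] = np.where(mask[:, None], source, df[cols].to_numpy())

print(df)
```

Because the comparison is done on plain arrays, index alignment is bypassed, so this only makes sense when the mask, the source column, and the targets all come from the same frame.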

Answer 2

Answered by Def_Os

>>> import pandas as pd
>>> test=pd.DataFrame({'A': [0,1,2], 'B': [3,4,5], 'C': [6,7,8]})
>>> test
   A  B  C
0  0  3  6
1  1  4  7
2  2  5  8
>>> test.apply(lambda x: x.where(test.A!=0, test.C), axis=0)
   A  B  C
0  6  6  6
1  1  4  7
2  2  5  8
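
Note that `apply` with `where` returns a new DataFrame and leaves `test` unchanged. A minimal sketch of the same idea, restricted to chosen columns and assigned back to the frame, might look like this; the names `cols` and `keep` are illustrative.

```python
import pandas as pd

test = pd.DataFrame({"A": [0, 1, 2], "B": [3, 4, 5], "C": [6, 7, 8]})
cols = ["A", "B"]

# Rows to keep as they are; computed before A is overwritten
keep = test["A"] != 0

# Series.where keeps values where the condition is True and takes the
# aligned value from test["C"] elsewhere
test[cols] = test[cols].apply(lambda s: s.where(keep, test["C"]))

print(test)
```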
