Assign new values to slice from MultiIndex DataFrame

Source: StackOverflow, http://stackoverflow.com/questions/16833842/. Licensed under CC BY-SA 4.0: you are free to use and share it, but you must attribute it to the original authors.

Tags: python, pandas, multi-index, dataframe

Asked by hadim

I would like to modify some values in a column of my DataFrame. At the moment I have a view obtained by selecting through the multi-index of my original df (and modifying it does change df).

Here's an example:

# Assumes: import numpy as np, pandas as pd; from numpy.random import randn
In [1]: arrays = [np.array(['bar', 'bar', 'baz', 'qux', 'qux', 'bar']),
                  np.array(['one', 'two', 'one', 'one', 'two', 'one']),
                  np.arange(0, 6, 1)]
In [2]: df = pd.DataFrame(randn(6, 3), index=arrays, columns=['A', 'B', 'C'])

In [3]: df
                  A         B         C
bar one 0 -0.088671  1.902021 -0.540959
    two 1  0.782919 -0.733581 -0.824522
baz one 2 -0.827128 -0.849712  0.072431
qux one 3 -0.328493  1.456945  0.587793
    two 4 -1.466625  0.720638  0.976438
bar one 5 -0.456558  1.163404  0.464295

I try to set a slice of df to a scalar value:

In [4]: df.ix['bar', 'two', :]['A']
Out[4]:
1    0.782919
Name: A, dtype: float64

In [5]: df.ix['bar', 'two', :]['A'] = 9999
# df is unchanged

I really want to modify several values in the column (and since indexing returns a vector, not a scalar value, I think this would make more sense):

In [6]: df.ix['bar', 'one', :]['A'] = [999, 888]
# again df remains unchanged

I'm using pandas 0.11. Is there a simple way to do this?

My current workaround is to recreate df from a new one and modify the values I want. But it's not elegant and can be very heavy on a complex DataFrame. In my opinion the problem comes from .ix and .loc returning a copy rather than a view.

Accepted answer by Jeff

Sort the frame, then select/set using a tuple for the multi-index:

In [12]: df = pd.DataFrame(randn(6, 3), index=arrays, columns=['A', 'B', 'C'])

In [13]: df
Out[13]: 
                  A         B         C
bar one 0 -0.694240  0.725163  0.131891
    two 1 -0.729186  0.244860  0.530870
baz one 2  0.757816  1.129989  0.893080
qux one 3 -2.275694  0.680023 -1.054816
    two 4  0.291889 -0.409024 -0.307302
bar one 5  1.697974 -1.828872 -1.004187

In [14]: df = df.sortlevel(0)

In [15]: df
Out[15]: 
                  A         B         C
bar one 0 -0.694240  0.725163  0.131891
        5  1.697974 -1.828872 -1.004187
    two 1 -0.729186  0.244860  0.530870
baz one 2  0.757816  1.129989  0.893080
qux one 3 -2.275694  0.680023 -1.054816
    two 4  0.291889 -0.409024 -0.307302

In [16]: df.loc[('bar','two'),'A'] = 9999

In [17]: df
Out[17]: 
                     A         B         C
bar one 0    -0.694240  0.725163  0.131891
        5     1.697974 -1.828872 -1.004187
    two 1  9999.000000  0.244860  0.530870
baz one 2     0.757816  1.129989  0.893080
qux one 3    -2.275694  0.680023 -1.054816
    two 4     0.291889 -0.409024 -0.307302
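The session above uses the pandas 0.11 API. As a minimal sketch for newer pandas versions, assuming `sort_index()` as the replacement for `sortlevel(0)` and a seeded NumPy generator in place of `randn` (both are my substitutions, not part of the original answer), the same sort-then-assign pattern might look like this:

```python
import numpy as np
import pandas as pd

# Same three-level index as in the question.
arrays = [
    np.array(['bar', 'bar', 'baz', 'qux', 'qux', 'bar']),
    np.array(['one', 'two', 'one', 'one', 'two', 'one']),
    np.arange(0, 6, 1),
]
rng = np.random.default_rng(0)  # assumption: seeded generator instead of randn
df = pd.DataFrame(rng.standard_normal((6, 3)), index=arrays, columns=['A', 'B', 'C'])

# Sort all index levels so that partial keys map to contiguous rows.
df = df.sort_index()

# One .loc call with (row key tuple, column label); no chained indexing.
df.loc[('bar', 'two'), 'A'] = 9999

print(df)
```

The key point is that the row selection and the column selection happen in a single `.loc` call, so the assignment targets `df` itself rather than an intermediate object.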

You can also do it without sorting if you specify the complete index, e.g.

In [23]: df.loc[('bar','two',1),'A'] = 999

In [24]: df
Out[24]: 
                    A         B         C
bar one 0   -0.113216  0.878715 -0.183941
    two 1  999.000000 -1.405693  0.253388
baz one 2    0.441543  0.470768  1.155103
qux one 3   -0.008763  0.917800 -0.699279
    two 4    0.061586  0.537913  0.380175
bar one 5    0.857231  1.144246 -2.369694
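A self-contained sketch of the unsorted case follows. It assumes a recent pandas version and a seeded NumPy generator for the data (my choices for reproducibility), and it relies on the index being unique so that the full three-level key identifies exactly one row:

```python
import numpy as np
import pandas as pd

arrays = [
    np.array(['bar', 'bar', 'baz', 'qux', 'qux', 'bar']),
    np.array(['one', 'two', 'one', 'one', 'two', 'one']),
    np.arange(0, 6, 1),
]
rng = np.random.default_rng(1)  # assumption: seeded generator instead of randn
df = pd.DataFrame(rng.standard_normal((6, 3)), index=arrays, columns=['A', 'B', 'C'])

# The frame is deliberately left unsorted here.
# A complete key (one label per index level) addresses a single row.
df.loc[('bar', 'two', 1), 'A'] = 999

print(df)
```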

To check the sort depth:

In [27]: df.index.lexsort_depth
Out[27]: 0

In [28]: df.sortlevel(0).index.lexsort_depth
Out[28]: 3
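To my knowledge, the public `lexsort_depth` attribute has been deprecated in later pandas releases. A rough equivalent check, assuming a recent pandas version, is to ask whether the index is sorted via `is_monotonic_increasing`. This is a sketch of that alternative, not the check used in the original answer, and it reports only sorted or not sorted rather than a depth:

```python
import numpy as np
import pandas as pd

arrays = [
    np.array(['bar', 'bar', 'baz', 'qux', 'qux', 'bar']),
    np.array(['one', 'two', 'one', 'one', 'two', 'one']),
    np.arange(0, 6, 1),
]
df = pd.DataFrame(np.zeros((6, 3)), index=arrays, columns=['A', 'B', 'C'])

# The trailing ('bar', 'one', 5) row breaks the ordering of the raw index.
unsorted_ok = df.index.is_monotonic_increasing

# After sorting on all levels the index is in lexicographic order.
sorted_ok = df.sort_index().index.is_monotonic_increasing

print(unsorted_ok, sorted_ok)
```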

For the last part of your question, assigning with a list: note that the list must have the same number of elements as the selection you are replacing, and the frame MUST be sorted for this to work.

In [12]: df.loc[('bar','one'),'A'] = [999,888]

In [13]: df
Out[13]: 
                    A         B         C
bar one 0  999.000000 -0.645641  0.369443
        5  888.000000 -0.990632 -0.577401
    two 1   -1.071410  2.308711  2.018476
baz one 2    1.211887  1.516925  0.064023
qux one 3   -0.862670 -0.770585 -0.843773
    two 4   -0.644855 -1.431962  0.232528
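The list assignment can be sketched end to end as follows, assuming a recent pandas version where `sort_index()` replaces `sortlevel(0)`, and using a seeded NumPy generator for the data (both are my assumptions). After sorting, the two `('bar', 'one')` rows are adjacent and the list values are assigned in index order:

```python
import numpy as np
import pandas as pd

arrays = [
    np.array(['bar', 'bar', 'baz', 'qux', 'qux', 'bar']),
    np.array(['one', 'two', 'one', 'one', 'two', 'one']),
    np.arange(0, 6, 1),
]
rng = np.random.default_rng(2)  # assumption: seeded generator instead of randn
df = pd.DataFrame(rng.standard_normal((6, 3)), index=arrays, columns=['A', 'B', 'C'])

# Sort first so the partial key ('bar', 'one') selects contiguous rows.
df = df.sort_index()

# The list length must match the number of selected rows (two here).
df.loc[('bar', 'one'), 'A'] = [999, 888]

print(df)
```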