Notice: this page is a Chinese-English parallel translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow. Original URL: http://stackoverflow.com/questions/36510146/

Date: 2020-09-14 01:01:19  Source: igfitidea

How to get value by multi-index with python pandas?

Tags: python, pandas, dataframe

Asked by xirururu

How can I get the value from a dataframe by its multi-index?

For example I have a dataframe mm:

import numpy as np
import pandas as pd

np.random.seed(1)
mm = pd.DataFrame(np.random.randn(5,2))
mm['A'] = np.arange(5)
mm['B'] = np.arange(5,10)
mm.set_index(['A','B'], inplace=True)

print(mm)

            0         1
A B                    
0 5  1.624345 -0.611756
1 6 -0.528172 -1.072969
2 7  0.865408 -2.301539
3 8  1.744812 -0.761207
4 9  0.319039 -0.249370

I want to get the value where A = 2, B = 7, how can I do that?

Is it possible to write a function like get_value(mm, (2,7)), then I can get the following result:

2 7  0.865408 -2.301539

Answered by unutbu

Use mm.loc to select rows by label:

In [28]: row = mm.loc[2,7]; row
Out[28]: 
0    0.865408
1   -2.301539
Name: (2, 7), dtype: float64

In [40]: np.concatenate([row.name, row])
Out[40]: array([ 2.        ,  7.        ,  0.86540763, -2.3015387 ])

Since mm has a MultiIndex, each row label is expressed as a tuple (e.g. (2,7)). When there is no ambiguity, such as inside brackets, the parentheses can be dropped: mm.loc[2, 7] is equivalent to mm.loc[(2, 7)].
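
The get_value(mm, (2,7)) function asked for in the question could be sketched on top of mm.loc as follows. This is only an illustration: get_value is a hypothetical helper, not a pandas API, and it assumes the key names exactly one row of the MultiIndex.

```python
import numpy as np
import pandas as pd

# Rebuild the example frame from the question.
np.random.seed(1)
mm = pd.DataFrame(np.random.randn(5, 2))
mm['A'] = np.arange(5)
mm['B'] = np.arange(5, 10)
mm = mm.set_index(['A', 'B'])


def get_value(df, key):
    """Hypothetical helper: return the row labelled `key` as a one-row DataFrame."""
    # Wrapping the tuple in a list asks .loc for "this list of labels",
    # so the result stays a DataFrame and keeps its MultiIndex.
    return df.loc[[tuple(key)]]


row = mm.loc[(2, 7)]            # tuple key: a Series named (2, 7)
frame = get_value(mm, (2, 7))   # list of tuples: a one-row DataFrame
print(frame)
```

Passing the bare tuple gives a Series, while wrapping it in a list keeps the "A B" index columns visible, which matches the output requested in the question.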

To get all rows where B=7, you could

  • use pd.IndexSlice:

    xs = pd.IndexSlice
    mm.loc[xs[:, 7], :]
    
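  • or the mm.query method:

    mm.query('B==7')
    
  • or mm.index.get_loc_level with mm.loc:

    mask, idx = mm.index.get_loc_level(7, level='B')
    mm.loc[mask]
    
  • or mm.index.get_loc_level with mm.iloc:

    # mask is a boolean array, which iloc also accepts; idx holds the
    # remaining 'A' labels rather than row positions
    mask, idx = mm.index.get_loc_level(7, level='B')
    mm.iloc[mask]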
    

Each of the expressions above returns the DataFrame:

            0         1
A B                    
2 7  0.865408 -2.301539
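
As a quick check, assuming the same mm frame as in the question, the sketch below confirms that several label-based selections on level B agree. The get_level_values mask and DataFrame.xs with drop_level=False are alternatives not mentioned in the answer; they are included only for comparison.

```python
import numpy as np
import pandas as pd

# Rebuild the example frame from the question.
np.random.seed(1)
mm = pd.DataFrame(np.random.randn(5, 2))
mm['A'] = np.arange(5)
mm['B'] = np.arange(5, 10)
mm = mm.set_index(['A', 'B'])

xs = pd.IndexSlice
by_slice = mm.loc[xs[:, 7], :]    # any A, B == 7
by_query = mm.query('B == 7')     # query can refer to index level names

# Alternatives (not from the answer above):
# a boolean mask built from the values of level 'B' ...
by_values = mm[mm.index.get_level_values('B') == 7]
# ... and a cross-section that keeps the 'B' level in the result.
by_xs = mm.xs(7, level='B', drop_level=False)

for result in (by_slice, by_query, by_values, by_xs):
    print(list(result.index), result.to_numpy())
```

Which form to prefer is mostly a matter of taste: pd.IndexSlice extends naturally to ranges on several levels, while query reads well when the condition is a simple comparison.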

Answered by Alexander

This returns your selection as a dataframe:

>>> mm.loc[[(2, 7)]]
            0         1
A B                    
2 7  0.865408 -2.301539

To get the index and values:

>>> mm.loc[[(2, 7)]].reset_index().values.tolist()[0]
[2.0, 7.0, 0.8654076293246785, -2.3015386968802827]
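
Note that going through reset_index().values turns the integer labels into floats, because the row is packed into a single float array. If the original label types matter, one possible variation (a sketch, not part of the answer) is to read the labels from the Series name instead:

```python
import numpy as np
import pandas as pd

# Rebuild the example frame from the question.
np.random.seed(1)
mm = pd.DataFrame(np.random.randn(5, 2))
mm['A'] = np.arange(5)
mm['B'] = np.arange(5, 10)
mm = mm.set_index(['A', 'B'])

as_series = mm.loc[(2, 7)]   # the Series name holds the index tuple (2, 7)

# Index labels first, then the row values; the labels stay integers.
flat = [*as_series.name, *as_series.tolist()]
print(flat)
```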

To get all values where the second item is 7:

>>> idx = pd.IndexSlice
>>> mm.loc[idx[:, 7], :]
            0         1
A B                    
2 7  0.865408 -2.301539