Filtering multiple items in a multi-index Python pandas DataFrame
Source: http://stackoverflow.com/questions/25224545/
Note: this page reproduces a Stack Overflow question and its answers under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors on Stack Overflow and keep it under the same license.
Asked by Tristan Forward
I have the following table:
Note: both NSRCODE and PBL_AWI are index levels.
Note: the % Of Area column will be filled in later; I just have not done so yet.
NSRCODE  PBL_AWI                 Area  % Of Area
CM       BONS            44705.492941
         BTNN           253854.591990
         FONG            41625.590370
         FONS            16814.159680
         Lake            57124.819333
         River            1603.906642
         SONS           583958.444751
         STNN            45603.837177
         clearcut       106139.013930
         disturbed      127719.865675
         lowland        118795.578059
         upland        2701289.270193
LBH      BFNN           289207.169650
         BONS          9140084.716743
         BTNI            33713.160390
         BTNN         19748004.789040
         FONG          1687122.469691
         FONS          5169959.591270
         FTNI           317251.976160
         FTNN          6536472.869395
         Lake           258046.508310
         River           44262.807900
         SONS          4379097.677405
         burn regen     744773.210860
         clearcut        54066.756790
         disturbed      597561.471686
         lowland      12591619.141842
         upland       23843453.638117
How do I filter the rows by item in the "PBL_AWI" index level? For example, I want to keep only ['Lake', 'River', 'Upland'].
Accepted answer by CT Zhu
You can use get_level_values in conjunction with Boolean indexing.
In [50]:
import numpy as np
# np.isin is the current name for the deprecated np.in1d
print(df[np.isin(df.index.get_level_values(1), ['Lake', 'River', 'Upland'])])
                             Area
NSRCODE  PBL_AWI
CM       Lake            57124.819333
         River            1603.906642
LBH      Lake           258046.508310
         River           44262.807900
The same idea can be expressed in many different ways, such as df[df.index.get_level_values('PBL_AWI').isin(['Lake', 'River', 'Upland'])]
Note that your data contains 'upland' rather than 'Upland', which is why no upland rows appear in the output above.
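The case mismatch noted above can be illustrated with a minimal sketch. The frame below is a hand-built subset of the question's table (an assumption made for illustration, not the asker's actual df), and lowering the labels before comparing is just one possible way to make the match case-insensitive.

```python
import pandas as pd

# Hand-built subset of the question's data, for illustration only.
index = pd.MultiIndex.from_tuples(
    [("CM", "BONS"), ("CM", "Lake"), ("CM", "River"), ("CM", "upland"),
     ("LBH", "BFNN"), ("LBH", "Lake"), ("LBH", "River"), ("LBH", "upland")],
    names=["NSRCODE", "PBL_AWI"],
)
df = pd.DataFrame(
    {"Area": [44705.492941, 57124.819333, 1603.906642, 2701289.270193,
              289207.169650, 258046.508310, 44262.807900, 23843453.638117]},
    index=index,
)

wanted = ["Lake", "River", "Upland"]
labels = df.index.get_level_values("PBL_AWI")

# Exact match: 'Upland' does not equal 'upland', so those rows are dropped.
exact = df[labels.isin(wanted)]

# Case-insensitive match: lower-case both the labels and the wanted values.
wanted_lower = [w.lower() for w in wanted]
relaxed = df[labels.str.lower().isin(wanted_lower)]

print(exact)
print(relaxed)
```

Here exact keeps only the Lake and River rows, while relaxed also keeps the upland rows.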
Answer by Pietro Battiston
Also (from here):
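import pandas as pd

def filter_by(df, constraints):
    """Filter MultiIndex by sublevels."""
    # One entry per index level: the requested labels, or slice(None) for "keep all".
    indexer = [constraints[name] if name in constraints else slice(None)
               for name in df.index.names]
    # A Series takes the tuple directly; a DataFrame needs it as the row indexer.
    return df.loc[tuple(indexer)] if len(df.shape) == 1 else df.loc[tuple(indexer),]

# Attach the helper as a method on both Series and DataFrame.
pd.Series.filter_by = filter_by
pd.DataFrame.filter_by = filter_by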
... to be used as
df.filter_by({'PBL_AWI' : ['Lake', 'River', 'Upland']})
(untested with Panels and higher dimension elements, but I do expect it to work)
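To make the indexer that filter_by builds more concrete, here is a minimal sketch of the equivalent plain .loc call. The frame is a hand-built subset of the question's table (an assumption for illustration). Only labels that actually exist in the index are requested, because depending on the pandas version a label missing from the level may raise a KeyError in this style of lookup.

```python
import pandas as pd

# Hand-built, sorted subset of the question's data, for illustration only.
index = pd.MultiIndex.from_tuples(
    [("CM", "BONS"), ("CM", "Lake"), ("CM", "River"), ("CM", "upland"),
     ("LBH", "BFNN"), ("LBH", "Lake"), ("LBH", "River"), ("LBH", "upland")],
    names=["NSRCODE", "PBL_AWI"],
)
df = pd.DataFrame(
    {"Area": [44705.492941, 57124.819333, 1603.906642, 2701289.270193,
              289207.169650, 258046.508310, 44262.807900, 23843453.638117]},
    index=index,
)

# One entry per index level: slice(None) keeps every NSRCODE,
# the list selects the wanted PBL_AWI labels.
indexer = (slice(None), ["Lake", "River"])
subset = df.loc[indexer, :]

print(subset)
```

The tuple has one position per index level, which is the same shape filter_by produces from its constraints dictionary.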
Answer by Nate
Another (maybe cleaner) way might be this one:
print(df[df.index.isin(['Lake', 'River', 'Upland'], level=1)])
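As a small variation on this approach, the level argument of MultiIndex.isin can also be given as the level name instead of its position, which may read more clearly. The sketch below uses a hand-built subset of the question's table (an assumption for illustration, not the asker's actual df).

```python
import pandas as pd

# Hand-built subset of the question's data, for illustration only.
index = pd.MultiIndex.from_tuples(
    [("CM", "BONS"), ("CM", "Lake"), ("CM", "River"), ("CM", "upland"),
     ("LBH", "BFNN"), ("LBH", "Lake"), ("LBH", "River"), ("LBH", "upland")],
    names=["NSRCODE", "PBL_AWI"],
)
df = pd.DataFrame(
    {"Area": [44705.492941, 57124.819333, 1603.906642, 2701289.270193,
              289207.169650, 258046.508310, 44262.807900, 23843453.638117]},
    index=index,
)

wanted = ["Lake", "River", "Upland"]

# Select the level by position or by name; both give the same Boolean mask.
mask_by_position = df.index.isin(wanted, level=1)
mask_by_name = df.index.isin(wanted, level="PBL_AWI")

filtered = df[mask_by_name]
print(filtered)
```

As with the other answers, 'Upland' matches nothing here because the data spells it 'upland'.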

