Python[pandas]: Select certain rows by index of another dataframe
Original question: http://stackoverflow.com/questions/48864923/
Note: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors on StackOverflow (not to this page).
Asked by giupardeb
I have a dataframe and I would like to select only the rows whose index values also appear in df1.index.
For example:
In [96]: df
Out[96]:
A B C D
1 1 4 9 1
2 4 5 0 2
3 5 5 1 0
22 1 3 9 6
and this index:
In[96]:df1.index
Out[96]:
Int64Index([ 1, 3, 4, 5, 6, 7, 22, 28, 29, 32,], dtype='int64', length=253)
I would like this output:
In [96]: df
Out[96]:
A B C D
1 1 4 9 1
3 5 5 1 0
22 1 3 9 6
Thanks
Answered by jezrael
Use isin:
df = df[df.index.isin(df1.index)]
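For a self-contained illustration, the sketch below rebuilds the question's df and uses a stand-in for df1.index. The name other_index is an assumption and holds only the ten labels visible in the question, not the full 253-label index. The boolean mask returned by isin keeps the rows of df in their original order.

```python
import pandas as pd

# Data frame from the question.
df = pd.DataFrame(
    {"A": [1, 4, 5, 1], "B": [4, 5, 5, 3], "C": [9, 0, 1, 9], "D": [1, 2, 0, 6]},
    index=[1, 2, 3, 22],
)

# Assumption: stand-in for df1.index, limited to the labels shown in the question.
other_index = pd.Index([1, 3, 4, 5, 6, 7, 22, 28, 29, 32])

# One boolean per row of df: True where the row label is found in other_index.
mask = df.index.isin(other_index)

# Boolean indexing keeps only the rows flagged True.
selected = df[mask]
print(selected)
```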
Or compute the intersection of the two indexes and select with loc:
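import numpy as np

# Three alternatives; use any one of them.
# Note: using & between two Index objects as a set operation was deprecated
# in later pandas versions, so prefer .intersection() there.
df = df.loc[df.index & df1.index]
df = df.loc[np.intersect1d(df.index, df1.index)]
df = df.loc[df.index.intersection(df1.index)]
print(df)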
A B C D
1 1 4 9 1
3 5 5 1 0
22 1 3 9 6
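As a runnable sketch of the intersection approach, the example below uses a reduced two-column df and an assumed stand-in named other_index for df1.index (only the labels visible in the question). It computes the common labels with both Index.intersection and np.intersect1d, then selects them with loc. Because every label passed to loc exists in df, no NaN rows appear.

```python
import numpy as np
import pandas as pd

# Reduced version of the question's data frame (two columns for brevity).
df = pd.DataFrame({"A": [1, 4, 5, 1], "B": [4, 5, 5, 3]}, index=[1, 2, 3, 22])

# Assumption: stand-in for df1.index, limited to the labels shown in the question.
other_index = pd.Index([1, 3, 4, 5, 6, 7, 22, 28, 29, 32])

# Labels present in both indexes, computed by pandas.
common = df.index.intersection(other_index)
by_intersection = df.loc[common]

# Same idea with NumPy; np.intersect1d returns the sorted unique common values.
common_np = np.intersect1d(df.index, other_index)
by_numpy = df.loc[common_np]

print(by_intersection)
print(by_numpy)
```

Unlike the isin mask, np.intersect1d returns sorted unique labels, so the two approaches can differ when df has an unsorted index or duplicate labels.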
EDIT:
I tried this solution: df = df.loc[df1.index]. Do you think that this solution is correct?
That solution is incorrect, because every label in df1.index that is missing from df produces a row of NaN:
df = df.loc[df1.index]
print (df)
A B C D
1 1.0 4.0 9.0 1.0
3 5.0 5.0 1.0 0.0
4 NaN NaN NaN NaN
5 NaN NaN NaN NaN
6 NaN NaN NaN NaN
7 NaN NaN NaN NaN
22 1.0 3.0 9.0 6.0
28 NaN NaN NaN NaN
29 NaN NaN NaN NaN
32 NaN NaN NaN NaN
C:/Dropbox/work-joy/so/_t/t.py:23: FutureWarning:
Passing list-likes to .loc or [] with any missing label will raise
KeyError in the future, you can use .reindex() as an alternative.
See the documentation here:
http://pandas.pydata.org/pandas-docs/stable/indexing.html#deprecate-loc-reindex-listlike
print (df)
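The warning above points to .reindex() for cases where the NaN rows for missing labels are actually wanted. A minimal sketch follows, again with an assumed stand-in named other_index for df1.index. In newer pandas releases, the .loc call with missing labels raises KeyError instead of warning, so the sketch wraps it in try/except rather than relying on either behavior.

```python
import pandas as pd

# Reduced version of the question's data frame (two columns for brevity).
df = pd.DataFrame({"A": [1, 4, 5, 1], "B": [4, 5, 5, 3]}, index=[1, 2, 3, 22])

# Assumption: stand-in for df1.index, limited to the labels shown in the question.
other_index = pd.Index([1, 3, 4, 5, 6, 7, 22, 28, 29, 32])

# reindex returns one row per requested label; labels absent from df become NaN rows.
padded = df.reindex(other_index)
print(padded)

# .loc with labels missing from df: a FutureWarning on older pandas, KeyError on newer.
try:
    df.loc[other_index]
    loc_raised = False
except KeyError:
    loc_raised = True
print("loc raised KeyError:", loc_raised)
```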
Answered by Spcogg the second
Passing the index to the row indexer/slicer of .loc now works; you just need to make sure to specify the columns as well, i.e.:
df = df.loc[df1.index, :] # works
and NOT
df = df.loc[df1.index] # won't work
IMO this is neater and more consistent with the expected usage of .loc.