Note: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow. Original URL: http://stackoverflow.com/questions/48864923/

Time: 2020-09-14 05:12:26  Source: igfitidea

Python[pandas]: Select certain rows by index of another dataframe

Tags: python, pandas, dataframe

Asked by giupardeb

I have a dataframe df and I would like to select only the rows whose index values appear in df1.index.

For example:

In [96]: df
Out[96]:
   A  B  C  D
1  1  4  9  1
2  4  5  0  2
3  5  5  1  0
22 1  3  9  6

and this index:

In [96]: df1.index
Out[96]:
Int64Index([  1,   3,   4,   5,   6,   7,  22,  28,  29,  32,], dtype='int64', length=253)

I would like this output:

In [96]: df
Out[96]:
   A  B  C  D
1  1  4  9  1
3  5  5  1  0
22 1  3  9  6

Thanks.

Answered by jezrael

Use isin to build a boolean mask over df's index:

df = df[df.index.isin(df1.index)]

Or compute the intersection of the two indexes and select with loc:

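# Three alternatives; pick one.
df = df.loc[df.index & df1.index]  # older pandas only: in pandas 2.0+ `&` on an Index is an elementwise logical operation, not a set intersection
df = df.loc[np.intersect1d(df.index, df1.index)]  # requires `import numpy as np`
df = df.loc[df.index.intersection(df1.index)]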


print (df)
    A  B  C  D
1   1  4  9  1
3   5  5  1  0
22  1  3  9  6
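
For reference, a minimal self-contained sketch of the isin and intersection approaches. The frame is rebuilt from the question's sample; `other_index` is a stand-in for df1.index and contains only the ten labels visible in the question (the real index has 253 entries), so treat it as an assumption.

```python
import pandas as pd

# Sample frame reconstructed from the question.
df = pd.DataFrame(
    {"A": [1, 4, 5, 1], "B": [4, 5, 5, 3], "C": [9, 0, 1, 9], "D": [1, 2, 0, 6]},
    index=[1, 2, 3, 22],
)

# Assumed stand-in for df1.index: only the labels shown in the question.
other_index = pd.Index([1, 3, 4, 5, 6, 7, 22, 28, 29, 32])

# Boolean mask: True where df's index label also occurs in other_index.
by_mask = df[df.index.isin(other_index)]

# Label-based selection using only the labels common to both indexes.
by_labels = df.loc[df.index.intersection(other_index)]

print(by_mask)
```

Both variants only ever ask for labels that exist in df, which is why neither produces NaN rows or missing-label errors.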

EDIT:

I tried the solution df = df.loc[df1.index]. Do you think this solution is correct?

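That solution is incorrect: labels in df1.index that are missing from df come back as all-NaN rows, and pandas warns that this will raise KeyError in the future: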

df = df.loc[df1.index]
print (df)

      A    B    C    D
1   1.0  4.0  9.0  1.0
3   5.0  5.0  1.0  0.0
4   NaN  NaN  NaN  NaN
5   NaN  NaN  NaN  NaN
6   NaN  NaN  NaN  NaN
7   NaN  NaN  NaN  NaN
22  1.0  3.0  9.0  6.0
28  NaN  NaN  NaN  NaN
29  NaN  NaN  NaN  NaN
32  NaN  NaN  NaN  NaN
C:/Dropbox/work-joy/so/_t/t.py:23: FutureWarning: 
Passing list-likes to .loc or [] with any missing label will raise
KeyError in the future, you can use .reindex() as an alternative.

See the documentation here:
http://pandas.pydata.org/pandas-docs/stable/indexing.html#deprecate-loc-reindex-listlike
  print (df)
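
The warning above points to .reindex() as the alternative when rows for missing labels are actually wanted. A small sketch of that behaviour follows; the `labels` list is a hypothetical shortened version of df1.index, and the KeyError branch assumes pandas 1.0 or later, where the deprecated behaviour was removed.

```python
import pandas as pd

# Sample frame reconstructed from the question.
df = pd.DataFrame(
    {"A": [1, 4, 5, 1], "B": [4, 5, 5, 3], "C": [9, 0, 1, 9], "D": [1, 2, 0, 6]},
    index=[1, 2, 3, 22],
)

# Hypothetical subset of df1.index; label 4 does not exist in df.
labels = [1, 3, 4, 22]

# reindex keeps every requested label and fills the missing one with NaN.
padded = df.reindex(labels)
print(padded)

# On pandas 1.0+ (assumption), .loc with a missing label is expected to raise.
try:
    df.loc[labels]
except KeyError as exc:
    print("KeyError:", exc)
```

If the NaN rows are not wanted, filtering with isin or intersection first avoids the problem altogether.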

Answered by Spcogg the second

Passing the index to the row indexer/slicer of .loc now works; you just need to make sure to specify the columns as well, i.e.:

df = df.loc[df1.index, :]  # works

and NOT:

df = df.loc[df1.index] # won't work

IMO this is neater and more consistent with the expected usage of .loc.
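
A quick way to compare the two spellings, offered as a sketch under assumptions: the `wanted` labels are made up to mirror the question, and they are first restricted to labels present in df, since recent pandas versions are expected to raise KeyError for missing labels with either spelling. Under that assumption both forms select the same rows.

```python
import pandas as pd

# Sample frame reconstructed from the question.
df = pd.DataFrame(
    {"A": [1, 4, 5, 1], "B": [4, 5, 5, 3], "C": [9, 0, 1, 9], "D": [1, 2, 0, 6]},
    index=[1, 2, 3, 22],
)

# Hypothetical labels standing in for df1.index; 4 is not in df.
wanted = pd.Index([1, 3, 4, 22])

# Keep only the labels that actually exist in df.
present = df.index.intersection(wanted)

rows_and_cols = df.loc[present, :]  # explicit row and column indexers
rows_only = df.loc[present]         # row indexer only

print(rows_and_cols.equals(rows_only))
```

Whether to write the explicit `:` for the columns is then mostly a matter of style.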