Note: this page reproduces a popular Stack Overflow question and its answers under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same CC BY-SA license, cite the original URL and authors, and attribute it to the original authors (not me). Original Stack Overflow URL: http://stackoverflow.com/questions/34897014/

Date: 2020-09-14 00:32:15  Source: igfitidea

How do I find the iloc of a row in a pandas DataFrame?

Tags: python, pandas, dataframe

Asked by lmsasu

I have an indexed pandas dataframe. By searching through its index, I find a row of interest. How do I find out the iloc of this row?

Example:

import numpy as np
import pandas as pd

dates = pd.date_range('1/1/2000', periods=8)
df = pd.DataFrame(np.random.randn(8, 4), index=dates, columns=['A', 'B', 'C', 'D'])
df
                   A         B         C         D
2000-01-01 -0.077564  0.310565  1.112333  1.023472
2000-01-02 -0.377221 -0.303613 -1.593735  1.354357
2000-01-03  1.023574 -0.139773  0.736999  1.417595
2000-01-04 -0.191934  0.319612  0.606402  0.392500
2000-01-05 -0.281087 -0.273864  0.154266  0.374022
2000-01-06 -1.953963  1.429507  1.730493  0.109981
2000-01-07  0.894756 -0.315175 -0.028260 -1.232693
2000-01-08 -0.032872 -0.237807  0.705088  0.978011

window_stop_row = df[df.index < '2000-01-04'].iloc[-1]
window_stop_row.name
Timestamp('2000-01-03 00:00:00', offset='D')
# What is the iloc of window_stop_row?

Answered by EdChum

You want the row's .name attribute; pass it to get_loc:

In [131]:
dates = pd.date_range('1/1/2000', periods=8)
df = pd.DataFrame(np.random.randn(8, 4), index=dates, columns=['A', 'B', 'C', 'D'])
df

Out[131]:
                   A         B         C         D
2000-01-01  0.095234 -1.000863  0.899732 -1.742152
2000-01-02 -0.517544 -1.274137  1.734024 -1.369487
2000-01-03  0.134112  1.964386 -0.120282  0.573676
2000-01-04 -0.737499 -0.581444  0.528500 -0.737697
2000-01-05 -1.777800  0.795093  0.120681  0.524045
2000-01-06 -0.048432 -0.751365 -0.760417 -0.181658
2000-01-07 -0.570800  0.248608 -1.428998 -0.662014
2000-01-08 -0.147326  0.717392  3.138620  1.208639

In [133]:    
window_stop_row = df[df.index < '2000-01-04'].iloc[-1]
window_stop_row.name

Out[133]:
Timestamp('2000-01-03 00:00:00', offset='D')

In [134]:
df.index.get_loc(window_stop_row.name)

Out[134]:
2

get_loc returns the ordinal (integer) position of the label in your index, which is what you want:

In [135]:    
df.iloc[df.index.get_loc(window_stop_row.name)]

Out[135]:
A    0.134112
B    1.964386
C   -0.120282
D    0.573676
Name: 2000-01-03 00:00:00, dtype: float64
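
As a self-contained sketch of the steps above: the fixed random seed and the final equality check are assumptions added here for reproducibility and are not part of the original answer.

```python
import numpy as np
import pandas as pd

dates = pd.date_range("2000-01-01", periods=8)
rng = np.random.default_rng(0)  # assumption: fixed seed so the run is repeatable
df = pd.DataFrame(rng.standard_normal((8, 4)), index=dates, columns=list("ABCD"))

# Last row whose index label is strictly before the cutoff
window_stop_row = df[df.index < "2000-01-04"].iloc[-1]

# A row taken from a DataFrame carries its index label in .name
label = window_stop_row.name

# Map the label back to its integer position in the index
pos = df.index.get_loc(label)
print(pos)

# Looking the row up by position returns the same row
same_row = df.iloc[pos].equals(window_stop_row)
print(same_row)
```

This relies on the index labels being unique. With duplicate labels, get_loc can return a slice or a boolean mask instead of a single integer.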

If you only want to search the index, then as long as it is sorted you can use searchsorted:

In [142]:
df.index.searchsorted('2000-01-04') - 1

Out[142]:
2
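
The searchsorted approach can be sketched on its own as follows. The variable names and the explicit pd.Timestamp cutoff are illustrative choices, not from the answer, and the sketch assumes a sorted index.

```python
import pandas as pd

index = pd.date_range("2000-01-01", periods=8)  # sorted, as searchsorted requires
cutoff = pd.Timestamp("2000-01-04")

# side="left" gives the first position whose label is >= cutoff,
# so one step back is the last label strictly before the cutoff
first_not_before = index.searchsorted(cutoff, side="left")
last_before = first_not_before - 1
print(last_before)

# Edge case: when no label precedes the cutoff the result is -1,
# which iloc would silently treat as "last row", so check for it
early = index.searchsorted(pd.Timestamp("1999-12-31"), side="left") - 1
has_match = early >= 0
print(early, has_match)
```

Unlike get_loc, this also works when the cutoff itself is not a label in the index.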

Answered by ascripter

While pandas.Index.get_loc() only works for a single key, the following pattern also works for getting the iloc of multiple elements:

np.argwhere(condition).flatten()   # array of all iloc where condition is True

In your case, to pick the last element where df.index < '2000-01-04':

np.argwhere(df.index < '2000-01-04').flatten()[-1]  # returns 2
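
A related sketch, not part of the original answer: np.flatnonzero is a one-dimensional shortcut for the argwhere-then-flatten pattern, and Index.get_indexer looks up several known labels at once. The example labels are made up for illustration.

```python
import numpy as np
import pandas as pd

index = pd.date_range("2000-01-01", periods=8)

# Positions of every label satisfying a condition
mask = index < "2000-01-04"
positions = np.flatnonzero(mask)
print(positions)

# Positions of several specific labels (requires a unique index);
# a label that is absent from the index is reported as -1
labels = pd.to_datetime(["2000-01-02", "2000-01-07", "2001-01-01"])
label_positions = index.get_indexer(labels)
print(label_positions)
```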

Answered by Anton Protopopov

IIUC you could call index for your case:

In [53]: df[df.index < '2000-01-04'].index[-1]
Out[53]: Timestamp('2000-01-03 00:00:00', offset='D') 

EDIT

I think @EdChum's answer is what you want. Alternatively, you could compare your dataframe against the row values you obtained, use all to find the row where every value matches, and then pass the resulting mask to the index:

In [67]: df == window_stop_row
Out[67]:
                A      B      C      D
2000-01-01  False  False  False  False
2000-01-02  False  False  False  False
2000-01-03   True   True   True   True
2000-01-04  False  False  False  False
2000-01-05  False  False  False  False
2000-01-06  False  False  False  False
2000-01-07  False  False  False  False
2000-01-08  False  False  False  False

In [68]: (df == window_stop_row).all(axis=1)
Out[68]:
2000-01-01    False
2000-01-02    False
2000-01-03     True
2000-01-04    False
2000-01-05    False
2000-01-06    False
2000-01-07    False
2000-01-08    False
Freq: D, dtype: bool

In [69]: df.index[(df == window_stop_row).all(axis=1)]
Out[69]: DatetimeIndex(['2000-01-03'], dtype='datetime64[ns]', freq='D')
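
The comparison above yields the matching label rather than an integer position. A minimal sketch of turning the same mask into positions is shown here; the deterministic data and the use of np.flatnonzero are assumptions added for illustration.

```python
import numpy as np
import pandas as pd

dates = pd.date_range("2000-01-01", periods=8)
# Assumption: deterministic values so that exactly one row matches
df = pd.DataFrame(np.arange(32, dtype=float).reshape(8, 4),
                  index=dates, columns=list("ABCD"))
window_stop_row = df[df.index < "2000-01-04"].iloc[-1]

# True for rows where every column equals the target row's value
row_mask = (df == window_stop_row).all(axis=1)

matching_labels = df.index[row_mask]                      # index labels
matching_positions = np.flatnonzero(row_mask.to_numpy())  # integer positions
print(matching_labels, matching_positions)
```

Note that rows containing NaN never match under ==, and duplicated rows would all be returned.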

Answered by Tom Patel

You could try looping through each row in the dataframe:

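    # iterrows() yields (index label, row); enumerate() supplies the integer position
    for row_number, (label, row) in enumerate(dataframe.iterrows()):
        if row['column_header'] == YourValue:
            print(row_number)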

This gives you the integer row position to use with iloc.
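
For larger frames a vectorized comparison may be preferable to looping. The sketch below is an assumption-laden illustration: the column name, index labels, and values are hypothetical. It shows the loop and a NumPy equivalent producing the same positions.

```python
import numpy as np
import pandas as pd

# Hypothetical data; "column_header" and your_value are placeholders
dataframe = pd.DataFrame({"column_header": [10, 20, 30, 20]},
                         index=["w", "x", "y", "z"])
your_value = 20

# Loop version: enumerate() counts rows, iterrows() yields (label, row)
loop_positions = [
    pos
    for pos, (_, row) in enumerate(dataframe.iterrows())
    if row["column_header"] == your_value
]

# Vectorized version: positions where the column equals the value
vector_positions = np.flatnonzero(
    dataframe["column_header"].to_numpy() == your_value
)
print(loop_positions, vector_positions)

# Either result can be passed to iloc to select the matching rows
selected = dataframe.iloc[vector_positions]
```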