Finding the index of the last True value in a Pandas Series or DataFrame

Question

Asked by user1507844

I'm trying to find the index of the last True value in a pandas boolean Series. My current code looks something like the below. Is there a faster or cleaner way of doing this?

import numpy as np
import pandas as pd
import string

index = np.random.choice(list(string.ascii_lowercase), size=1000)
df = pd.DataFrame(np.random.randn(1000, 2), index=index)
s = pd.Series(np.random.choice([True, False], size=1000), index=index)

last_true_idx_s = s.index[s][-1]
last_true_idx_df = df[s].iloc[-1].name

Answer 1

Answered by jezrael

You can use idxmax, which is the same as the argmax from Andy Hayden's answer:

print(s[::-1].idxmax())
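
As a small illustration of why reversing works: idxmax returns the label of the first maximum, and in a boolean Series True is the maximum, so on the reversed Series it lands on the last True. The sketch below uses made-up toy data (the names demo and all_false are not from the question). It also shows a caveat: if there is no True at all, idxmax still returns a label, so a guard with any() may be needed.

```python
import pandas as pd

# Hypothetical toy data, chosen so the answer is easy to check by eye.
demo = pd.Series([False, True, False, True, False], index=list("abcde"))

# Reverse the Series, then take the label of the first maximum (True).
last_true = demo[::-1].idxmax()
print(last_true)

# Caveat: with no True values, idxmax still returns a label
# (the first label of the reversed Series), so guard with any().
all_false = pd.Series([False, False, False], index=list("xyz"))
guarded = all_false[::-1].idxmax() if all_false.any() else None
print(guarded)
```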

Comparing:

These timings depend heavily on the size of s and on the number (and position) of True values.

In [2]: %timeit s.index[s][-1]
The slowest run took 6.92 times longer than the fastest. This could mean that an intermediate result is being cached 
10000 loops, best of 3: 35 μs per loop

In [3]: %timeit s[::-1].argmax()
The slowest run took 6.67 times longer than the fastest. This could mean that an intermediate result is being cached 
10000 loops, best of 3: 126 μs per loop

In [4]: %timeit s[::-1].idxmax()
The slowest run took 6.55 times longer than the fastest. This could mean that an intermediate result is being cached 
10000 loops, best of 3: 127 μs per loop

In [5]: %timeit s[s==True].last_valid_index()
The slowest run took 8.10 times longer than the fastest. This could mean that an intermediate result is being cached 
1000 loops, best of 3: 261 μs per loop

In [6]: %timeit (s[s==True].index.tolist()[-1])
The slowest run took 6.11 times longer than the fastest. This could mean that an intermediate result is being cached 
1000 loops, best of 3: 239 μs per loop

In [7]: %timeit (s[s==True].index[-1])
The slowest run took 5.75 times longer than the fastest. This could mean that an intermediate result is being cached 
1000 loops, best of 3: 227 μs per loop

EDIT:

Next solution:

print(s[s==True].index[-1])

EDIT1: Solution

(s[s==True].index.tolist()[-1])

was in a deleted answer.

Answer 2

Answered by EdChum

Use last_valid_index:

In [9]:
s.tail(10)

Out[9]:
h    False
w     True
h    False
r     True
q    False
b    False
p    False
e    False
q    False
d    False
dtype: bool

In [8]:
s[s==True].last_valid_index()

Out[8]:
'r'
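
One property of this approach that may matter in practice: when the mask selects nothing, last_valid_index returns None rather than raising, whereas .index[-1] on the empty selection raises IndexError. Below is a minimal sketch with invented data (demo and none_true are illustrative names, not from the question). Since the Series is already boolean, s[s] selects the same rows as s[s==True].

```python
import pandas as pd

# Hypothetical toy data for illustration only.
demo = pd.Series([False, True, False, True, False], index=list("abcde"))
none_true = pd.Series([False, False], index=list("xy"))

# Keep only the True entries, then ask for the last valid label.
found = demo[demo].last_valid_index()

# With no True entries the filtered Series is empty and None comes back.
missing = none_true[none_true].last_valid_index()

# The .index[-1] variant raises on an empty selection instead.
try:
    none_true[none_true].index[-1]
    raised = False
except IndexError:
    raised = True

print(found, missing, raised)
```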

Answer 3

Answered by Andy Hayden

argmax gets the first True. Use argmax on the reversed Series:

In [11]: s[::-1].argmax()
Out[11]: 'e'

Here:

In [12]: s.tail()
Out[12]:
n     True
e     True
k    False
d    False
l    False
dtype: bool
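
A caution for readers on newer pandas: the output above reflects the version in use when this answer was written, where Series.argmax behaved like idxmax and returned a label. Since pandas 1.0, Series.argmax returns an integer position instead. The sketch below, using made-up data (demo is an illustrative name), shows one way the position could be mapped back to a label under that newer behavior.

```python
import pandas as pd

# Hypothetical toy data for illustration only.
demo = pd.Series([False, True, False, True, False], index=list("abcde"))

# In pandas >= 1.0 this is a position within the *reversed* Series.
pos_in_reversed = demo[::-1].argmax()

# Convert to a position in the original Series, then to a label.
pos = len(demo) - 1 - pos_in_reversed
label = demo.index[pos]
print(pos, label)
```

If only the label is needed, idxmax on the reversed Series avoids the conversion step.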
