Find index of last True value in pandas Series or DataFrame
Original question: http://stackoverflow.com/questions/34384349/
Note: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors on StackOverflow.
Asked by user1507844
I'm trying to find the index of the last True value in a pandas boolean Series. My current code looks something like the below. Is there a faster or cleaner way of doing this?
import numpy as np
import pandas as pd
import string
index = np.random.choice(list(string.ascii_lowercase), size=1000)
df = pd.DataFrame(np.random.randn(1000, 2), index=index)
s = pd.Series(np.random.choice([True, False], size=1000), index=index)
last_true_idx_s = s.index[s][-1]
last_true_idx_df = df[s].iloc[-1].name
Answered by jezrael
You can use idxmax on the reversed Series, which is the same as the argmax in Andy Hayden's answer:
print(s[::-1].idxmax())
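As a small illustration of the reversed idxmax approach, here is a minimal sketch on a hand-made Series. The data and the helper name last_true_label are made up for this example and are not part of the answer. idxmax returns the label of the first maximum, so on the reversed Series it lands on the last True. Note that in recent pandas versions Series.argmax returns an integer position rather than a label, so idxmax is the safer spelling when a label is wanted.

```python
import pandas as pd

# Hypothetical example data: True at labels "b" and "d"
s = pd.Series([False, True, False, True, False], index=list("abcde"))

# Reverse the Series; idxmax then returns the label of the first True it meets
last_true = s[::-1].idxmax()
print(last_true)  # d

# Caveat: with no True at all, the maximum is False and idxmax still
# returns a label (the first one of the reversed Series, i.e. the last label)
all_false = pd.Series([False, False, False], index=list("xyz"))
print(all_false[::-1].idxmax())  # z


def last_true_label(series):
    """Hypothetical helper: label of the last True, or None if there is none."""
    return series[::-1].idxmax() if series.any() else None


print(last_true_label(all_false))  # None
```

Whether returning None is the right behaviour for an all-False Series depends on the caller; raising an exception would be an equally reasonable choice.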
Comparing:
These timings depend heavily on the size of s as well as on the number (and position) of True values.
In [2]: %timeit s.index[s][-1]
The slowest run took 6.92 times longer than the fastest. This could mean that an intermediate result is being cached
10000 loops, best of 3: 35 μs per loop
In [3]: %timeit s[::-1].argmax()
The slowest run took 6.67 times longer than the fastest. This could mean that an intermediate result is being cached
10000 loops, best of 3: 126 μs per loop
In [4]: %timeit s[::-1].idxmax()
The slowest run took 6.55 times longer than the fastest. This could mean that an intermediate result is being cached
10000 loops, best of 3: 127 μs per loop
In [5]: %timeit s[s==True].last_valid_index()
The slowest run took 8.10 times longer than the fastest. This could mean that an intermediate result is being cached
1000 loops, best of 3: 261 μs per loop
In [6]: %timeit (s[s==True].index.tolist()[-1])
The slowest run took 6.11 times longer than the fastest. This could mean that an intermediate result is being cached
1000 loops, best of 3: 239 μs per loop
In [7]: %timeit (s[s==True].index[-1])
The slowest run took 5.75 times longer than the fastest. This could mean that an intermediate result is being cached
1000 loops, best of 3: 227 μs per loop
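To repeat this comparison outside IPython, something like the following sketch with the standard timeit module could be used. It rebuilds a Series like the one in the question; the fixed seed, the number of repetitions and the np.flatnonzero variant are assumptions added here, not part of the original answers. No timings are quoted because they vary by machine, pandas version and data.

```python
import string
import timeit

import numpy as np
import pandas as pd

# Same shape of data as in the question; the fixed seed is an assumption
# added so that repeated runs see the same Series.
rng = np.random.default_rng(0)
index = rng.choice(list(string.ascii_lowercase), size=1000)
s = pd.Series(rng.choice([True, False], size=1000), index=index)

candidates = {
    # Index filtered by the boolean mask (passed as a NumPy array here)
    "s.index[mask][-1]": lambda: s.index[s.to_numpy()][-1],
    "s[::-1].idxmax()": lambda: s[::-1].idxmax(),
    "s[s].last_valid_index()": lambda: s[s].last_valid_index(),
    "s[s].index[-1]": lambda: s[s].index[-1],
    # NumPy variant, not from the answers: position of the last True, then its label
    "np.flatnonzero": lambda: s.index[np.flatnonzero(s.to_numpy())[-1]],
}

# Sanity check first: every candidate should return the same label
results = {name: fn() for name, fn in candidates.items()}

repeats = 200
for name, fn in candidates.items():
    seconds = timeit.timeit(fn, number=repeats)
    print(f"{name:26s} {seconds / repeats * 1e6:8.1f} us per call")
```

Checking that all candidates agree before timing them guards against benchmarking an expression that silently returns something different.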
EDIT:
Next solution:
print(s[s==True].index[-1])
EDIT1: The solution
(s[s==True].index.tolist()[-1])
was in a deleted answer.
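Since s already has boolean dtype, the comparison s==True produces the same mask as s itself, so the filter can also be written as s[s]. A minimal sketch on made-up data, which also shows that this indexing approach raises IndexError when the Series contains no True value:

```python
import pandas as pd

# Hypothetical example data: True at labels "b" and "d"
s = pd.Series([False, True, False, True, False], index=list("abcde"))

with_comparison = s[s == True].index[-1]
without_comparison = s[s].index[-1]
print(with_comparison, without_comparison)  # d d

# With no True values the filtered index is empty, so [-1] fails
all_false = pd.Series([False, False], index=["x", "y"])
try:
    all_false[all_false].index[-1]
    raised = False
except IndexError:
    raised = True
print(raised)  # True
```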
Answered by EdChum
Use last_valid_index:
In [9]:
s.tail(10)
Out[9]:
h False
w True
h False
r True
q False
b False
p False
e False
q False
d False
dtype: bool
In [8]:
s[s==True].last_valid_index()
Out[8]:
'r'
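One property of last_valid_index that may matter in practice is that it returns None on an empty selection instead of raising. The sketch below uses made-up data (the column name x is hypothetical) and also applies the same mask to a DataFrame, as in the question:

```python
import pandas as pd

# Hypothetical example data sharing one index
s = pd.Series([False, True, False, True, False], index=list("abcde"))
df = pd.DataFrame({"x": range(5)}, index=s.index)

# Label of the last True value in the Series
label = s[s].last_valid_index()
print(label)  # d

# No True values: the filtered Series is empty and the result is None
all_false = pd.Series([False, False], index=["x", "y"])
missing = all_false[all_false].last_valid_index()
print(missing)  # None

# Same mask on the DataFrame: last row where s is True
row = df[s].iloc[-1]
print(row.name, row["x"])  # d 3
```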