Python: access the index of the last element in a data frame
Notice: this page is a translation of a popular StackOverFlow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverFlow
Original URL: http://stackoverflow.com/questions/15862034/
Access index of last element in data frame
Asked by elelias
I've been looking around for this but I can't seem to find it (though it must be extremely trivial).
The problem that I have is that I would like to retrieve the value of a column for the first and last entries of a data frame. But if I do:
df.ix[0]['date']
I get:
datetime.datetime(2011, 1, 10, 16, 0)
but if I do:
df[-1:]['date']
I get:
myIndex
13 2011-12-20 16:00:00
Name: mydate
with a different format. Ideally, I would like to be able to access the value of the last index of the data frame, but I can't find how.
I even tried to create a column (IndexCopy) with the values of the index and try:
df.ix[df.tail(1)['IndexCopy']]['mydate']
but this also yields a different format (since df.tail(1)['IndexCopy'] does not output a simple integer).
Any ideas?
Accepted answer by DSM
The former answer is now superseded by .iloc:
>>> df = pd.DataFrame({"date": range(10, 64, 8)})
>>> df.index += 17
>>> df
date
17 10
18 18
19 26
20 34
21 42
22 50
23 58
>>> df["date"].iloc[0]
10
>>> df["date"].iloc[-1]
58
Former answer (note that .iget() has since been deprecated and removed from pandas): the shortest way I can think of uses .iget():
>>> df = pd.DataFrame({"date": range(10, 64, 8)})
>>> df.index += 17
>>> df
date
17 10
18 18
19 26
20 34
21 42
22 50
23 58
>>> df['date'].iget(0)
10
>>> df['date'].iget(-1)
58
Alternatively:
>>> df['date'][df.index[0]]
10
>>> df['date'][df.index[-1]]
58
There's also .first_valid_index() and .last_valid_index(), but depending on whether or not you want to rule out NaNs they might not be what you want.
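To illustrate the NaN caveat, here is a minimal sketch. The series `s` and its values are made up for this example and are not from the original answer. It compares the label of the last row with the label of the last non-NaN row.

```python
import numpy as np
import pandas as pd

# Hypothetical series whose last entry is missing (illustration only).
s = pd.Series([10.0, 18.0, np.nan], index=[17, 18, 19])

last_label = s.index[-1]           # label of the last row, NaN or not
last_valid = s.last_valid_index()  # label of the last non-NaN row

print(last_label)  # 19
print(last_valid)  # 18
```

If the data has no missing values, the two agree; otherwise .last_valid_index() skips trailing NaN rows.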
Remember that df.ix[0] doesn't give you the first row, but the one indexed by 0. For example, in the above case, df.ix[0] would produce
>>> df.ix[0]
Traceback (most recent call last):
File "<ipython-input-489-494245247e87>", line 1, in <module>
df.ix[0]
[...]
KeyError: 0
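Since .ix was removed in pandas 1.0, the same distinction can be sketched with .iloc (by position) and .loc (by label). This reuses the frame from the answer; the variable names are only for illustration.

```python
import pandas as pd

df = pd.DataFrame({"date": range(10, 64, 8)})
df.index += 17  # labels now run from 17 to 23

first_by_position = df["date"].iloc[0]  # position 0
first_by_label = df["date"].loc[17]     # label 17, the same row

try:
    df.loc[0]  # label-based lookup: there is no row labelled 0
    label_zero_found = True
except KeyError:
    label_zero_found = False

print(first_by_position, first_by_label, label_zero_found)
```

Here .loc[0] fails for the same reason df.ix[0] did: the lookup is by label, and no row carries the label 0.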
Answer by comte
df.tail(1).index
seems the most readable
Answer by Tai
Combining @comte's answer and dmdip's answer in "Get index of a row of a pandas dataframe as an integer":
df.tail(1).index.item()
gives you the value of the index.
Note that indices are not always well defined, no matter whether they are multi-indexed or single-indexed. Modifying dataframes using indices might result in unexpected behavior. The example below uses a multi-indexed case, but the same is true in a single-indexed case.
Say we have
df = pd.DataFrame({'x':[1,1,3,3], 'y':[3,3,5,5]}, index=[11,11,12,12]).stack()
11 x 1
y 3
x 1
y 3
12 x 3
y 5 # the index is (12, 'y')
x 3
y 5 # the index is also (12, 'y')
df.tail(1).index.item() # gives (12, 'y')
Trying to access the last element with the index, df[12, "y"], yields
(12, y) 5
(12, y) 5
dtype: int64
If you attempt to modify the dataframe based on the index (12, y), you will modify two rows rather than one. Thus, even though we learned to access the value of the last row's index, it might not be a good idea to change the values of the last row based on its index, as many rows could share the same index. In this case, use df.iloc[-1] to access the last row instead.
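The same pitfall can be sketched in the single-indexed case. The series below is a made-up example with duplicate labels, not taken from the answer. It contrasts assignment by label with assignment by position.

```python
import pandas as pd

# Hypothetical series with duplicate index labels (illustration only).
s = pd.Series([1, 3, 5, 5], index=[11, 11, 12, 12])

by_label = s.copy()
by_label.loc[by_label.index[-1]] = 0  # label 12 matches two rows

by_position = s.copy()
by_position.iloc[-1] = 0              # touches only the final row

print(by_label.tolist())     # [1, 3, 0, 0]
print(by_position.tolist())  # [1, 3, 5, 0]
```

Assigning through the last label overwrites every row that shares it, while .iloc[-1] changes only the final row.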
Reference
https://pandas.pydata.org/pandas-docs/stable/generated/pandas.Index.item.html
Answer by yoonghm
It may be too late now, but I use the index attribute to retrieve the index of a DataFrame, then use [-1] to get its last value:
For example,
import numpy as np
import pandas as pd
df = pd.DataFrame(np.zeros((4, 1)), columns=['A'])
print(f'df:\n{df}\n')
print(f'Index = {df.index}\n')
print(f'Last index = {df.index[-1]}')
The output is
df:
A
0 0.0
1 0.0
2 0.0
3 0.0
Index = RangeIndex(start=0, stop=4, step=1)
Last index = 3
Answer by grofte
You want .iloc with double brackets.
import pandas as pd
df = pd.DataFrame({"date": range(10, 64, 8), "not_date": "fools"})
df.index += 17
df.iloc[[0,-1]][['date']]
You give .iloc a list of indexes - specifically the first and last, [0, -1]. That returns a dataframe from which you ask for the 'date' column. ['date'] will give you a series (yuck), and [['date']] will give you a dataframe.
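As a rough sketch of the difference between the bracket styles, using the same frame as above (the variable names are only for illustration):

```python
import pandas as pd

df = pd.DataFrame({"date": range(10, 64, 8), "not_date": "fools"})
df.index += 17

as_frame = df.iloc[[0, -1]][["date"]]  # double brackets: a DataFrame
as_series = df.iloc[[0, -1]]["date"]   # single brackets: a Series
as_scalar = df["date"].iloc[-1]        # one position: a single value

print(type(as_frame).__name__)   # DataFrame
print(type(as_series).__name__)  # Series
print(as_series.tolist())        # [10, 58]
```

Which form is preferable probably depends on what the result feeds into; for a single value, as the question asked, the scalar form is likely enough.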

