Python: access the index of the last element in a data frame

Note: this page is a Chinese-English translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). Original StackOverflow question: http://stackoverflow.com/questions/15862034/

Time: 2020-08-18 21:12:40  Source: igfitidea

Access index of last element in data frame

Tags: python, pandas

Asked by elelias

I've been looking around for this but I can't seem to find it (though it must be extremely trivial).

The problem I have is that I would like to retrieve the value of a column for the first and last entries of a data frame. But if I do:

df.ix[0]['date']

I get:

datetime.datetime(2011, 1, 10, 16, 0)

but if I do:

df[-1:]['date']

I get:

myIndex
13         2011-12-20 16:00:00
Name: mydate

in a different format. Ideally, I would like to be able to access the value at the last index of the data frame, but I can't find out how.

I even tried to create a column (IndexCopy) with the values of the index and then tried:

df.ix[df.tail(1)['IndexCopy']]['mydate']

but this also yields a different format (since df.tail(1)['IndexCopy'] does not output a simple integer).

Any ideas?

Accepted answer by DSM

The former answer is now superseded by .iloc:

>>> import pandas as pd
>>> df = pd.DataFrame({"date": range(10, 64, 8)})
>>> df.index += 17
>>> df
    date
17    10
18    18
19    26
20    34
21    42
22    50
23    58
>>> df["date"].iloc[0]
10
>>> df["date"].iloc[-1]
58
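
To connect this back to the datetime column in the question, here is a minimal sketch. The frame is made-up data (the asker's real frame is not shown), and it assumes a pandas version where `.iloc` is available. It suggests that `.iloc[0]` and `.iloc[-1]` each return a single value, while a one-row selection such as `df.tail(1)["date"]` is still a Series carrying its index, which is probably why the output in the question looked like a different format.

```python
import datetime as dt

import pandas as pd

# Hypothetical data standing in for the asker's frame: a datetime column
# and an integer index that does not start at 0.
df = pd.DataFrame(
    {
        "date": [
            dt.datetime(2011, 1, 10, 16, 0),
            dt.datetime(2011, 6, 15, 16, 0),
            dt.datetime(2011, 12, 20, 16, 0),
        ]
    },
    index=[11, 12, 13],
)

first = df["date"].iloc[0]      # single value: first row by position
last = df["date"].iloc[-1]      # single value: last row by position
one_row = df.tail(1)["date"]    # one-element Series, index label still attached

print(type(first).__name__, first)
print(type(last).__name__, last)
print(type(one_row).__name__, len(one_row))
```

In recent pandas versions the single values come back as `pandas.Timestamp`, a subclass of `datetime.datetime`, rather than a plain `datetime.datetime`.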


The original answer is kept below for reference; .iget() has since been deprecated and removed from pandas. The shortest way I can think of uses .iget():

>>> df = pd.DataFrame({"date": range(10, 64, 8)})
>>> df.index += 17
>>> df
    date
17    10
18    18
19    26
20    34
21    42
22    50
23    58
>>> df['date'].iget(0)
10
>>> df['date'].iget(-1)
58

Alternatively:

>>> df['date'][df.index[0]]
10
>>> df['date'][df.index[-1]]
58

There's also .first_valid_index() and .last_valid_index(), but depending on whether or not you want to rule out NaNs, they might not be what you want.
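
A small sketch of the difference, using a made-up Series with NaNs at both ends (the data and variable names are assumptions for illustration). The valid-index methods skip the NaN entries, so they can disagree with the plain first and last labels:

```python
import numpy as np
import pandas as pd

# Hypothetical Series whose first and last entries are missing.
s = pd.Series([np.nan, 18.0, 26.0, np.nan], index=[17, 18, 19, 20])

first_label = s.index[0]               # label of the first row, NaN or not
last_label = s.index[-1]               # label of the last row, NaN or not
first_valid = s.first_valid_index()    # label of the first non-NaN row
last_valid = s.last_valid_index()      # label of the last non-NaN row

print(first_label, last_label)
print(first_valid, last_valid)
```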

Remember that df.ix[0] doesn't give you the first row, but the one indexed by 0. For example, in the above case, df.ix[0] would produce

>>> df.ix[0]
Traceback (most recent call last):
  File "<ipython-input-489-494245247e87>", line 1, in <module>
    df.ix[0]
[...]
KeyError: 0
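
Since .ix was removed in pandas 1.0, the same label-versus-position distinction can be sketched with `.loc` (label-based) and `.iloc` (position-based). This is only an illustration on the example frame from above, not part of the original answer:

```python
import pandas as pd

df = pd.DataFrame({"date": range(10, 64, 8)})
df.index += 17   # labels now run from 17 to 23

by_position = df.iloc[0]["date"]   # first row, whatever its label is
by_label = df.loc[17]["date"]      # the row whose label is 17

# There is no row labelled 0, so a label lookup fails.
try:
    df.loc[0]
    raised = False
except KeyError:
    raised = True

print(by_position, by_label, raised)
```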

Answer by comte

df.tail(1).index 

seems the most readable

Answer by Tai

Combining @comte's answer with dmdip's answer to "Get index of a row of a pandas dataframe as an integer":

df.tail(1).index.item()

gives you the value of the index.
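
As a rough sketch on the example frame from the accepted answer (reused here as an assumption), `df.tail(1).index` is still an Index object of length one, `.item()` unwraps it into a scalar, and `df.index[-1]` reaches the same label directly:

```python
import pandas as pd

df = pd.DataFrame({"date": range(10, 64, 8)})
df.index += 17

last_index_obj = df.tail(1).index      # an Index holding one label
last_label = df.tail(1).index.item()   # that label as a scalar
same_label = df.index[-1]              # same label, taken by position

print(last_index_obj)
print(last_label, same_label)
```

Note that `.item()` only works on an index with exactly one element, which is why it is paired with `tail(1)` here.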

Note that indices are not always well defined, no matter whether they are multi-indexed or single-indexed. Modifying dataframes using indices might result in unexpected behavior. We will look at an example with a multi-indexed case, but note that this is also true in a single-indexed case.

Say we have

df = pd.DataFrame({'x':[1,1,3,3], 'y':[3,3,5,5]}, index=[11,11,12,12]).stack()

11  x    1
    y    3
    x    1
    y    3
12  x    3
    y    5              # the index is (12, 'y')
    x    3
    y    5              # the index is also (12, 'y')

df.tail(1).index.item() # gives (12, 'y')

Trying to access the last element with the index, df[12, "y"], yields

(12, y)    5
(12, y)    5
dtype: int64

If you attempt to modify the dataframe based on the index (12, y), you will modify two rows rather than one. Thus, even though we learned how to access the value of the last row's index, it might not be a good idea to change the values of the last row based on its index, as there could be many rows that share the same index. In this case, you should use df.iloc[-1] to access the last row instead.
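
The same pitfall can be sketched in the single-indexed case mentioned earlier. The Series below is made-up data with a repeated label; assigning through the last label touches every row that shares it, while assigning by position touches only the last row:

```python
import pandas as pd

# Hypothetical Series in which the label 12 appears twice.
s = pd.Series([1, 3, 5], index=[11, 12, 12])

by_label = s.copy()
by_label.loc[s.index[-1]] = 0    # label-based: both rows labelled 12 change

by_position = s.copy()
by_position.iloc[-1] = 0         # position-based: only the last row changes

print(by_label.tolist())
print(by_position.tolist())
```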

Reference

https://pandas.pydata.org/pandas-docs/stable/generated/pandas.Index.item.html

Answer by yoonghm

It may be too late now, but I use the index attribute to retrieve the index of a DataFrame, then use [-1] to get the last value:

For example,

import numpy as np
import pandas as pd

df = pd.DataFrame(np.zeros((4, 1)), columns=['A'])
print(f'df:\n{df}\n')

print(f'Index = {df.index}\n')
print(f'Last index = {df.index[-1]}')

The output is

df:
     A
0  0.0
1  0.0
2  0.0
3  0.0

Index = RangeIndex(start=0, stop=4, step=1)

Last index = 3
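
Once the last label is known, it can be fed back into a label-based lookup to get the value the question asked for. This is a sketch that assumes the label is unique; with repeated labels, a positional lookup such as `df["A"].iloc[-1]` is the safer choice:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.zeros((4, 1)), columns=["A"])

last_label = df.index[-1]              # label of the last row
last_value = df.at[last_label, "A"]    # single-value lookup by label (assumes a unique label)

print(last_label, last_value)
```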

Answer by grofte

You want .iloc with double brackets.

import pandas as pd
df = pd.DataFrame({"date": range(10, 64, 8), "not_date": "fools"})
df.index += 17
df.iloc[[0,-1]][['date']]

You give .iloc a list of indexes - specifically the first and last, [0, -1]. That returns a dataframe from which you ask for the 'date' column. ['date'] will give you a series (yuck), and [['date']] will give you a dataframe.
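
To make the single versus double bracket difference concrete, here is a short sketch on the same frame. The variable names are only for illustration, and the final `tolist()` step is an addition for anyone who wants plain Python values:

```python
import pandas as pd

df = pd.DataFrame({"date": range(10, 64, 8), "not_date": "fools"})
df.index += 17

as_frame = df.iloc[[0, -1]][["date"]]   # DataFrame: two rows, one column
as_series = df.iloc[[0, -1]]["date"]    # Series: two values
as_list = as_series.tolist()            # plain Python list of the two values

print(type(as_frame).__name__, as_frame.shape)
print(type(as_series).__name__, as_list)
```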
