Python: how to delete the last row of a pandas DataFrame

Note: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license and attribute it to the original authors (not me). Original StackOverflow question: http://stackoverflow.com/questions/26921651/

Time: 2020-08-19 01:10:13  Source: igfitidea

How to delete the last row of data of a pandas dataframe

Tags: python, pandas

Asked by tumultous_rooster

I think this should be simple, but I tried a few ideas and none of them worked:

last_row = len(DF)
DF = DF.drop(DF.index[last_row])  #<-- fail!

I tried using negative indices, but that also led to errors. I must still be misunderstanding something basic.

Answered by ely

Since index positioning in Python is 0-based, there is no element in index at the position corresponding to len(DF). You need last_row = len(DF) - 1 instead:

In [49]: dfrm
Out[49]: 
          A         B         C
0  0.120064  0.785538  0.465853
1  0.431655  0.436866  0.640136
2  0.445904  0.311565  0.934073
3  0.981609  0.695210  0.911697
4  0.008632  0.629269  0.226454
5  0.577577  0.467475  0.510031
6  0.580909  0.232846  0.271254
7  0.696596  0.362825  0.556433
8  0.738912  0.932779  0.029723
9  0.834706  0.002989  0.333436

[10 rows x 3 columns]

In [50]: dfrm.drop(dfrm.index[len(dfrm)-1])
Out[50]: 
          A         B         C
0  0.120064  0.785538  0.465853
1  0.431655  0.436866  0.640136
2  0.445904  0.311565  0.934073
3  0.981609  0.695210  0.911697
4  0.008632  0.629269  0.226454
5  0.577577  0.467475  0.510031
6  0.580909  0.232846  0.271254
7  0.696596  0.362825  0.556433
8  0.738912  0.932779  0.029723

[9 rows x 3 columns]

However, it's much simpler to just write DF[:-1].
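
A minimal sketch of why the original attempt fails and how the fixes compare. It assumes a small made-up frame named df with the default integer index; the data are illustrative and not taken from the question:

```python
import pandas as pd

# Hypothetical three-row frame with the default index (labels 0, 1, 2).
df = pd.DataFrame({"A": [1, 2, 3], "B": [4, 5, 6]})

# Positions run from 0 to len(df) - 1, so position len(df) is out of bounds.
try:
    df.index[len(df)]
    out_of_bounds = False
except IndexError:
    out_of_bounds = True

# Three ways to get a new frame without the last row.
via_drop = df.drop(df.index[len(df) - 1])
via_negative = df.drop(df.index[-1])
via_slice = df[:-1]
```

None of these three modifies df itself, so the result has to be assigned back (for example DF = DF[:-1]) to have a lasting effect.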

Answered by Kane Chew

To drop the last n rows:

df.drop(df.tail(n).index,inplace=True) # drop last n rows

In the same vein, you can drop the first n rows:

df.drop(df.head(n).index,inplace=True) # drop first n rows
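
A short sketch of both calls on a made-up frame (the names df, dup and n are illustrative). It also shows one thing to keep in mind: drop works on index labels rather than positions, so this approach assumes the index labels are unique.

```python
import pandas as pd

# Hypothetical five-row frame; n is the number of rows to drop.
df = pd.DataFrame({"A": range(5)})
n = 2

# Without inplace=True, drop() returns a new frame and leaves df untouched.
without_last = df.drop(df.tail(n).index)
without_first = df.drop(df.head(n).index)

# drop() removes rows by label: with repeated labels, every row carrying
# a matching label is removed, not just the last one.
dup = pd.DataFrame({"A": [10, 20, 30]}, index=["x", "y", "x"])
after_drop = dup.drop(dup.tail(1).index)  # removes both rows labelled "x"
```

When the index may contain duplicate labels, positional slicing such as df.iloc[:-n] avoids the issue.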

Answered by blue-sky

DF[:-n]

where n is the number of rows to drop from the end.

To drop the last row:

DF = DF[:-1]
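
One caveat when n is a variable: for n = 0, the slice [:-0] is the same as [:0] and returns an empty frame instead of the whole one. A minimal sketch of a guard, where drop_last is a hypothetical helper written for this example and not a pandas function:

```python
import pandas as pd

# Made-up four-row frame.
df = pd.DataFrame({"A": [1, 2, 3, 4]})


def drop_last(frame, n):
    """Return frame without its last n rows (hypothetical helper)."""
    # Compute the stop position explicitly so that n == 0 keeps every row.
    return frame.iloc[: max(len(frame) - n, 0)]


empty_by_accident = df[:-0]   # -0 == 0, so this is df[:0]
unchanged = drop_last(df, 0)  # all four rows
trimmed = drop_last(df, 1)    # first three rows
```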

Answered by PrimeTime

drop returns a new DataFrame rather than modifying the original, so the result has to be assigned back. I had a similar requirement: I needed to rename some column headers and delete some rows because an ill-formed CSV file had been converted to a DataFrame. After reading this post, I used:

newList = pd.DataFrame(newList)
newList.columns = ['Area', 'Price']
print(newList)
# newList = newList.drop(0)
# newList = newList.drop(len(newList))
newList = newList[1:-1]
print(newList)

and it worked great. As the two commented-out lines above show, I also tried the drop() method; it works, but it is not as neat and readable as slicing with [n:-n]. Hope that helps someone.

Answered by Riz.Khan

import pandas as pd
stats = pd.read_csv(r"C:\py\programs\second pandas\ex.csv")  # raw string keeps the backslashes literal

Output of stats:

           A           B           C
0   0.120064    0.785538    0.465853
1   0.431655    0.436866    0.640136
2   0.445904    0.311565    0.934073
3   0.981609    0.695210    0.911697
4   0.008632    0.629269    0.226454
5   0.577577    0.467475    0.510031
6   0.580909    0.232846    0.271254
7   0.696596    0.362825    0.556433
8   0.738912    0.932779    0.029723
9   0.834706    0.002989    0.333436

Just use skipfooter=1. From the read_csv documentation:

skipfooter : int, default 0

Number of lines at bottom of file to skip

stats_2 = pd.read_csv(r"C:\py\programs\second pandas\ex.csv", skipfooter=1, engine='python')  # raw string keeps the backslashes literal

Output of stats_2:

           A           B           C
0   0.120064    0.785538    0.465853
1   0.431655    0.436866    0.640136
2   0.445904    0.311565    0.934073
3   0.981609    0.695210    0.911697
4   0.008632    0.629269    0.226454
5   0.577577    0.467475    0.510031
6   0.580909    0.232846    0.271254
7   0.696596    0.362825    0.556433
8   0.738912    0.932779    0.029723
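
A self-contained sketch of the same idea that does not depend on a file on disk. The CSV text below is made up and read through io.StringIO instead of a path:

```python
import io

import pandas as pd

# Made-up CSV text standing in for the file on disk (no trailing newline).
csv_text = "A,B,C\n1,2,3\n4,5,6\n7,8,9"

# Read everything, then read again while skipping the last line.
# skipfooter is not supported by the C parser, hence engine="python".
full = pd.read_csv(io.StringIO(csv_text))
trimmed = pd.read_csv(io.StringIO(csv_text), skipfooter=1, engine="python")
```

This only applies when the data come from a file; for a DataFrame that already exists, slicing or drop is the way to go.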

Answered by theGirrafish

Surprised nobody brought this one up:

# To remove last n rows
df.head(-n)

# To remove first n rows
df.tail(-n)
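
A quick check, on a made-up five-row frame, that negative arguments to head and tail select the same rows as the corresponding slices:

```python
import pandas as pd

# Hypothetical frame; n is the number of rows to remove.
df = pd.DataFrame({"A": [1, 2, 3, 4, 5]})
n = 2

all_but_last = df.head(-n)   # same rows as df[:-n]
all_but_first = df.tail(-n)  # same rows as df[n:]
```

As with slicing, both calls return a new frame and leave df unchanged.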

Running a speed test on a DataFrame of 1000 rows shows that slicing and head/tail are about 6 times faster than using drop:

>>> %timeit df[:-1]
125 μs ± 132 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)

>>> %timeit df.head(-1)
129 μs ± 1.18 μs per loop (mean ± std. dev. of 7 runs, 10000 loops each)

>>> %timeit df.drop(df.tail(1).index)
751 μs ± 20.4 μs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
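
For anyone who wants to repeat the comparison outside IPython, here is a rough sketch using the standard timeit module. The 1000-row frame of random floats and the loop count are assumptions; the data used for the measurements above are not shown.

```python
import timeit

import numpy as np
import pandas as pd

# Hypothetical 1000-row frame of random floats.
df = pd.DataFrame(np.random.rand(1000, 3), columns=["A", "B", "C"])

# The three approaches being compared, wrapped so timeit can call them.
candidates = {
    "df[:-1]": lambda: df[:-1],
    "df.head(-1)": lambda: df.head(-1),
    "df.drop(df.tail(1).index)": lambda: df.drop(df.tail(1).index),
}

# Average seconds per call over a fixed number of repetitions.
loops = 200
timings = {
    label: timeit.timeit(func, number=loops) / loops
    for label, func in candidates.items()
}

for label, seconds in timings.items():
    print(f"{label:<28} {seconds * 1e6:8.1f} microseconds per call")
```

Absolute numbers depend on the machine and the pandas version, so they will not match the figures above exactly.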