How to delete the last row of data of a pandas DataFrame
Original source: http://stackoverflow.com/questions/26921651/
Note: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverFlow
Asked by tumultous_rooster
I think this should be simple, but I tried a few ideas and none of them worked:
last_row = len(DF)
DF = DF.drop(DF.index[last_row]) #<-- fail!
I tried using negative indices, but that also led to errors. I must still be misunderstanding something basic.
Answered by ely
Since index positioning in Python is 0-based, there is no element in index at the position len(DF). You need last_row = len(DF) - 1 instead:
In [49]: dfrm
Out[49]:
A B C
0 0.120064 0.785538 0.465853
1 0.431655 0.436866 0.640136
2 0.445904 0.311565 0.934073
3 0.981609 0.695210 0.911697
4 0.008632 0.629269 0.226454
5 0.577577 0.467475 0.510031
6 0.580909 0.232846 0.271254
7 0.696596 0.362825 0.556433
8 0.738912 0.932779 0.029723
9 0.834706 0.002989 0.333436
[10 rows x 3 columns]
In [50]: dfrm.drop(dfrm.index[len(dfrm)-1])
Out[50]:
A B C
0 0.120064 0.785538 0.465853
1 0.431655 0.436866 0.640136
2 0.445904 0.311565 0.934073
3 0.981609 0.695210 0.911697
4 0.008632 0.629269 0.226454
5 0.577577 0.467475 0.510031
6 0.580909 0.232846 0.271254
7 0.696596 0.362825 0.556433
8 0.738912 0.932779 0.029723
[9 rows x 3 columns]
However, it's much simpler to just write DF[:-1].
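To make the off-by-one concrete, here is a minimal sketch on a small made-up frame (the column names and values are assumptions for illustration only). It shows that indexing at len(df) fails, while position -1 and a positional slice both remove the last row:

```python
import pandas as pd

df = pd.DataFrame({"A": [1, 2, 3], "B": [4, 5, 6]})

# len(df) is 3, but valid positions in df.index are 0, 1 and 2,
# so indexing with len(df) raises IndexError.
try:
    df.index[len(df)]
    out_of_bounds = False
except IndexError:
    out_of_bounds = True

# Position -1 (equivalently len(df) - 1) refers to the last label.
dropped = df.drop(df.index[-1])

# Positional slicing keeps the same rows without a label lookup.
sliced = df.iloc[:-1]

print(out_of_bounds)           # True
print(dropped.equals(sliced))  # True
```

Using df.iloc[:-1] rather than df[:-1] is a stylistic choice here: both slice by position, but iloc makes that explicit.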
Answered by Kane Chew
To drop last n rows:
df.drop(df.tail(n).index,inplace=True) # drop last n rows
In the same vein, you can drop the first n rows:
df.drop(df.head(n).index,inplace=True) # drop first n rows
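One thing worth knowing about this approach is that drop removes rows by label, not by position. The sketch below, on small made-up frames, shows the non-inplace form and what happens when index labels repeat; the frame contents are assumptions for illustration:

```python
import pandas as pd

df = pd.DataFrame({"A": range(5)})
n = 2

# Non-inplace form: returns a new frame and leaves df untouched.
trimmed = df.drop(df.tail(n).index)

# Caveat: drop() removes by label. With repeated labels, every row
# carrying one of those labels is dropped, not just the trailing one.
dup = pd.DataFrame({"A": [10, 20, 30]}, index=["x", "y", "x"])
by_label = dup.drop(dup.tail(1).index)  # removes both "x" rows
by_position = dup.iloc[:-1]             # removes only the last row

print(len(trimmed), len(by_label), len(by_position))  # 3 1 2
```

If the index may contain duplicates, a positional slice is probably the safer choice.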
Answered by blue-sky
DF[:-n]
where n is the number of rows to drop from the end.
To drop the last row:
DF = DF[:-1]
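A small pitfall with DF[:-n] is n = 0: since -0 == 0, the slice becomes DF[:0] and returns an empty frame instead of the whole one. The helper below is a hypothetical wrapper (drop_last is not a pandas function) that sidesteps this by computing the stop position explicitly:

```python
import pandas as pd

def drop_last(df: pd.DataFrame, n: int) -> pd.DataFrame:
    """Return df without its last n rows (hypothetical helper)."""
    if n < 0:
        raise ValueError("n must be non-negative")
    # len(df) - n avoids the -0 == 0 pitfall; max() guards n > len(df).
    return df.iloc[: max(len(df) - n, 0)]

df = pd.DataFrame({"A": [1, 2, 3, 4]})
print(len(df[:-0]))           # 0, because df[:-0] is df[:0]
print(len(drop_last(df, 0)))  # 4
print(len(drop_last(df, 1)))  # 3
```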
Answered by PrimeTime
drop returns a new DataFrame rather than modifying the original, which is easy to trip over (the error in the original post itself comes from indexing DF.index at len(DF), which is out of bounds). I had a similar requirement: I needed to rename some column headers and delete some rows because an ill-formed CSV file had been converted to a DataFrame. After reading this post I used:
newList = pd.DataFrame(newList)
newList.columns = ['Area', 'Price']
print(newList)
# newList = newList.drop(0)
# newList = newList.drop(len(newList))
newList = newList[1:-1]
print(newList)
and it worked great. As you can see from the two commented-out lines above, I tried the drop() method and it works, but it is not as cool and readable as using [n:-n]. Hope that helps someone, thanks.
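For readers who want to try the [1:-1] trim without a file at hand, here is a minimal sketch. The rows are invented stand-ins for a malformed CSV, and the final reset_index step is an optional extra that the answer above does not use:

```python
import pandas as pd

# Hypothetical rows standing in for a malformed CSV: a junk first row
# and a junk trailing row around the real data.
rows = [["junk", "junk"], ["North", 100], ["South", 250], ["total", "n/a"]]
frame = pd.DataFrame(rows, columns=["Area", "Price"])

# A positional slice drops the first and last rows in one step.
trimmed = frame.iloc[1:-1]
print(list(trimmed.index))     # [1, 2]: the original labels are kept

# reset_index(drop=True) renumbers from 0 if that matters downstream.
renumbered = trimmed.reset_index(drop=True)
print(list(renumbered.index))  # [0, 1]
```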
Answered by Riz.Khan
stats = pd.read_csv(r"C:\py\programs\second pandas\ex.csv")  # raw string keeps the backslashes literal
The output of stats:
A B C
0 0.120064 0.785538 0.465853
1 0.431655 0.436866 0.640136
2 0.445904 0.311565 0.934073
3 0.981609 0.695210 0.911697
4 0.008632 0.629269 0.226454
5 0.577577 0.467475 0.510031
6 0.580909 0.232846 0.271254
7 0.696596 0.362825 0.556433
8 0.738912 0.932779 0.029723
9 0.834706 0.002989 0.333436
Just use skipfooter=1:
skipfooter : int, default 0
Number of lines at bottom of file to skip
stats_2 = pd.read_csv(r"C:\py\programs\second pandas\ex.csv", skipfooter=1, engine='python')
Output of stats_2:
A B C
0 0.120064 0.785538 0.465853
1 0.431655 0.436866 0.640136
2 0.445904 0.311565 0.934073
3 0.981609 0.695210 0.911697
4 0.008632 0.629269 0.226454
5 0.577577 0.467475 0.510031
6 0.580909 0.232846 0.271254
7 0.696596 0.362825 0.556433
8 0.738912 0.932779 0.029723
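The same idea can be tried without a file on disk. The sketch below feeds read_csv an in-memory string through io.StringIO; the CSV contents are made up for illustration. engine="python" is passed because the default C parser does not support skipfooter:

```python
import io
import pandas as pd

# Made-up CSV text standing in for the file on disk.
csv_text = "A,B,C\n1,2,3\n4,5,6\n7,8,9\n"

full = pd.read_csv(io.StringIO(csv_text))

# skipfooter=1 discards the last line of the file while parsing.
trimmed = pd.read_csv(io.StringIO(csv_text), skipfooter=1, engine="python")

print(len(full), len(trimmed))  # 3 2
```

This only helps when the unwanted row can be dropped at load time; for a frame that already exists, slicing is the more direct route.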
Answered by theGirrafish
Surprised nobody brought this one up:
# To remove last n rows
df.head(-n)
# To remove first n rows
df.tail(-n)
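A quick sanity check of the negative-argument forms, on a small made-up frame, comparing them with the equivalent positional slices:

```python
import pandas as pd

df = pd.DataFrame({"A": range(6)})
n = 2

# head(-n) keeps everything except the last n rows.
without_last = df.head(-n)
# tail(-n) keeps everything except the first n rows.
without_first = df.tail(-n)

print(without_last.equals(df.iloc[:-n]))  # True
print(without_first.equals(df.iloc[n:]))  # True
```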
Running a speed test on a DataFrame of 1000 rows shows that slicing and head/tail are about 6 times faster than using drop:
>>> %timeit df[:-1]
125 μs ± 132 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
>>> %timeit df.head(-1)
129 μs ± 1.18 μs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
>>> %timeit df.drop(df.tail(1).index)
751 μs ± 20.4 μs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
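The %timeit lines above need IPython. As a rough way to repeat the comparison in a plain script, the sketch below uses the standard-library timeit module on a random 1000-row frame. The frame contents and the repeat counts are assumptions, and the absolute numbers will differ by machine and pandas version:

```python
import timeit

import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.rand(1000, 3), columns=["A", "B", "C"])

candidates = {
    "df[:-1]": lambda: df[:-1],
    "df.head(-1)": lambda: df.head(-1),
    "df.drop(df.tail(1).index)": lambda: df.drop(df.tail(1).index),
}

timings = {}
for label, func in candidates.items():
    # Best of 5 repeats of 200 calls, reported as seconds per call.
    best = min(timeit.repeat(func, number=200, repeat=5)) / 200
    timings[label] = best
    print(f"{label:30s} {best * 1e6:8.1f} microseconds per call")
```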

