Extract first and last row of a dataframe in pandas
Note: this page is a translation of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): Stack Overflow
Original URL: http://stackoverflow.com/questions/36542169/
Asked by Bryan P
How can I extract the first and last rows of a given dataframe as a new dataframe in pandas?
I've tried to use iloc to select the desired rows and then concat, as in:
import pandas as pd
df = pd.DataFrame({'a': range(1, 5), 'b': ['a', 'b', 'c', 'd']})
pd.concat([df.iloc[0, :], df.iloc[-1, :]])
but this does not produce a pandas dataframe:
a    1
b    a
a    4
b    d
dtype: object
Answered by su79eu7k
I think the simplest way is .iloc[[0, -1]].
df = pd.DataFrame({'a': range(1, 5), 'b': ['a', 'b', 'c', 'd']})
df2 = df.iloc[[0, -1]]
print(df2)
   a  b
0  1  a
3  4  d
Answered by Colonel Beauvel
You can also use head and tail:
In [29]: pd.concat([df.head(1), df.tail(1)])
Out[29]:
   a  b
0  1  a
3  4  d
Answered by joh-mue
The accepted answer duplicates the first row if the frame only contains a single row. If that's a concern,
df[0::len(df)-1 if len(df) > 1 else 1]
works even for single-row dataframes.
For the following dataframe this will not create a duplicate:
df = pd.DataFrame({'a': [1], 'b': ['a']})
df2 = df[0::len(df)-1 if len(df) > 1 else 1]
print(df2)
   a  b
0  1  a
whereas this does:
df3 = df.iloc[[0, -1]]
print(df3)
   a  b
0  1  a
0  1  a
because the single row is the first AND last row at the same time.
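To make the edge cases explicit, the same idea can be wrapped in a small function. This is only a sketch: first_and_last is a hypothetical helper name, not part of pandas, and it assumes that an empty or single-row frame should simply be returned unchanged.

```python
import pandas as pd


def first_and_last(df):
    """Hypothetical helper: return the first and last rows as a DataFrame."""
    # With zero or one row there is nothing to pair up, so return a copy as-is.
    # (df.iloc[[0, -1]] would raise on an empty frame and duplicate a single row.)
    if len(df) <= 1:
        return df.copy()
    return df.iloc[[0, -1]]


single = pd.DataFrame({'a': [1], 'b': ['a']})
several = pd.DataFrame({'a': range(1, 5), 'b': ['a', 'b', 'c', 'd']})
empty = several.iloc[0:0]

print(first_and_last(single))
print(first_and_last(several))
print(len(first_and_last(empty)))
```

The explicit length check trades the one-line slice for readability; both approaches avoid the duplicated row.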
Answered by jezrael
I think you can try adding the parameter axis=1 to concat, because the outputs of df.iloc[0,:] and df.iloc[-1,:] are Series, and then transpose with T:
print(df.iloc[0, :])
a    1
b    a
Name: 0, dtype: object
print(df.iloc[-1, :])
a    4
b    d
Name: 3, dtype: object
print(pd.concat([df.iloc[0, :], df.iloc[-1, :]], axis=1))
   0  3
a  1  4
b  a  d
print(pd.concat([df.iloc[0, :], df.iloc[-1, :]], axis=1).T)
   a  b
0  1  a
3  4  d
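A related point: a scalar row position in iloc returns a Series, while a list of positions returns a DataFrame. The sketch below reuses the question's sample frame to show the difference. The variable names are illustrative, and this is offered as an alternative to the transpose, not as what the answer above does.

```python
import pandas as pd

df = pd.DataFrame({'a': range(1, 5), 'b': ['a', 'b', 'c', 'd']})

row_as_series = df.iloc[0, :]    # scalar row position -> Series
row_as_frame = df.iloc[[0], :]   # list of positions -> one-row DataFrame

# One-row DataFrames stack as rows, so no axis=1 or transpose is needed.
result = pd.concat([df.iloc[[0], :], df.iloc[[-1], :]])

print(type(row_as_series).__name__, type(row_as_frame).__name__)
print(result)
```

Unlike the transposed version, this route never passes each row through a mixed-type Series, so it should keep the original column dtypes.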
Answered by Mina Gabriel
Here is a way to get the same truncated style that pandas uses when displaying large datasets (first five rows, an ellipsis row, last five rows):
x = df[:5]
y = pd.DataFrame([['...']*df.shape[1]], columns=df.columns, index=['...'])
z = df[-5:]
frame = [x, y, z]
result = pd.concat(frame)
print(result)
Output:
                     date  temp
0     1981-01-01 00:00:00  20.7
1     1981-01-02 00:00:00  17.9
2     1981-01-03 00:00:00  18.8
3     1981-01-04 00:00:00  14.6
4     1981-01-05 00:00:00  15.8
...                   ...   ...
3645  1990-12-27 00:00:00    14
3646  1990-12-28 00:00:00  13.6
3647  1990-12-29 00:00:00  13.5
3648  1990-12-30 00:00:00  15.7
3649  1990-12-31 00:00:00    13
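The snippet above depends on a df that is not shown, so here is a self-contained sketch of the same idea. The preview function name, its n parameter, and the synthetic data are assumptions made up for illustration; they are not from the original answer.

```python
import pandas as pd


def preview(df, n=5):
    """Hypothetical helper: first n rows, an ellipsis row, then last n rows."""
    # Short frames have nothing to elide, so return them unchanged.
    if len(df) <= 2 * n:
        return df
    gap = pd.DataFrame([['...'] * df.shape[1]], columns=df.columns, index=['...'])
    return pd.concat([df.head(n), gap, df.tail(n)])


# Synthetic data, invented purely for this example.
data = pd.DataFrame({'day': range(100), 'value': [x * 0.5 for x in range(100)]})
result = preview(data, n=3)
print(result)
```

Because the ellipsis row holds strings, the columns of the combined frame become object dtype, so the result is best treated as display-only and not used for further computation.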