pandas dataframe sort by date
Note: this page is a Chinese-English translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow
Original URL: http://stackoverflow.com/questions/41433765/
Asked by ajax2000
I made a dataframe by importing a csv file, converted the date column to datetime, and made it the index. However, sorting the index doesn't produce the result I wanted:
print(df.head())
df['Date'] = pd.to_datetime(df['Date'])
df.index = df['Date']
del df['Date']
df.sort_index()
print(df.head())
Here's the result:
         Date     Last
0  2016-12-30  1.05550
1  2016-12-29  1.05275
2  2016-12-28  1.04610
3  2016-12-27  1.05015
4  2016-12-23  1.05005

               Last
Date
2016-12-30  1.05550
2016-12-29  1.05275
2016-12-28  1.04610
2016-12-27  1.05015
2016-12-23  1.05005
The dates actually go back to 1999, so if I sort by date, it should show the data in ascending order, right?
Answered by Marjan Moderc
Just expanding on MaxU's correct answer: you used the correct method, but, as with many other pandas methods, sort_index() returns a new DataFrame instead of modifying the existing one, so the result has to be stored for the change to take effect. As MaxU already suggested, this is done by assigning the output of the method back to the same variable, e.g.:
df = df.sort_index()
or by passing the argument inplace=True, which sorts the existing DataFrame in place, so the variable does not need to be reassigned:
df.sort_index(inplace=True)
However, in my experience, I often feel "safer" using the first option. It also looks clearer and more consistent, since not all methods offer the inplace option. But it all comes down to scripting style, I guess...
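The difference between the two options can be seen in a minimal sketch. The small frame below is built from three of the question's sample rows; the variable names (sorted_copy, df2, result) are illustrative and not from the original post.

```python
import pandas as pd

# Sample rows taken from the question, in descending date order.
df = pd.DataFrame(
    {"Last": [1.05550, 1.05275, 1.04610]},
    index=pd.to_datetime(["2016-12-30", "2016-12-29", "2016-12-28"]),
)
df.index.name = "Date"

# sort_index() returns a new, sorted DataFrame; df itself is unchanged.
sorted_copy = df.sort_index()
first_before = df.index[0]            # df still starts at the latest date
first_in_copy = sorted_copy.index[0]  # the copy starts at the earliest date

# Option 1: assign the result back to the same name.
df = df.sort_index()

# Option 2: sort in place. With inplace=True the call returns None,
# so its return value must not be assigned back to the variable.
df2 = sorted_copy.sort_index(ascending=False)  # a descending frame to sort
result = df2.sort_index(inplace=True)
```

Writing df = df.sort_index(inplace=True) would combine the two options incorrectly and leave df set to None, which is one reason to pick a single style and stick with it.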
Answered by Ishan Khatri
The data looks like this:
Date,Last
2016-12-30,1.05550
2016-12-29,1.05275
2016-12-28,1.04610
2016-12-27,1.05015
2016-12-23,1.05005
Read the data using pandas:
import pandas as pd
df = pd.read_csv('data',sep=',')
# Displays the data head
print (df.head())
Date Last
0 2016-12-30 1.05550
1 2016-12-29 1.05275
2 2016-12-28 1.04610
3 2016-12-27 1.05015
4 2016-12-23 1.05005
# Sort column with name Date
df = df.sort_values(by = 'Date')
Date Last
4 2016-12-23 1.05005
3 2016-12-27 1.05015
2 2016-12-28 1.04610
1 2016-12-29 1.05275
0 2016-12-30 1.05550
# reset the index
df.reset_index(inplace=True)
# Display the data head after index reset
index Date Last
0 4 2016-12-23 1.05005
1 3 2016-12-27 1.05015
2 2 2016-12-28 1.04610
3 1 2016-12-29 1.05275
4 0 2016-12-30 1.05550
# delete the index
del df['index']
# Display the data head
print (df.head())
Date Last
0 2016-12-23 1.05005
1 2016-12-27 1.05015
2 2016-12-28 1.04610
3 2016-12-29 1.05275
4 2016-12-30 1.05550
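The steps above can be condensed. This is a sketch, assuming the CSV has the Date and Last columns shown in this answer. The data is read from an in-memory string (csv_text, a stand-in for the file named 'data') so the example is self-contained. Note that the code above sorts Date as plain strings, which happens to give chronological order only because the dates are written as YYYY-MM-DD; passing parse_dates makes Date a real datetime column so the sort is chronological regardless.

```python
import io

import pandas as pd

# Stand-in for the CSV file used in the answer.
csv_text = """Date,Last
2016-12-30,1.05550
2016-12-29,1.05275
2016-12-28,1.04610
2016-12-27,1.05015
2016-12-23,1.05005
"""

# parse_dates converts the Date column to datetime64 while reading.
df = pd.read_csv(io.StringIO(csv_text), parse_dates=["Date"])

# Sort by date, then renumber the rows. drop=True discards the old index
# instead of adding it as an 'index' column that has to be deleted later.
df = df.sort_values(by="Date").reset_index(drop=True)

print(df.head())
```

Using reset_index(drop=True) replaces the separate reset_index and del df['index'] steps with one call.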