Original question: http://stackoverflow.com/questions/19105976/
Note: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): Stack Overflow
Get MM-DD-YYYY from pandas Timestamp
Asked by blaklaybul
Dates seem to be a tricky thing in Python, and I am having a lot of trouble simply stripping the date out of a pandas Timestamp. I would like to get from 2013-09-29 02:34:44 to simply 09-29-2013.
I have a dataframe with a column Created_date:
Name: Created_Date, Length: 1162549, dtype: datetime64[ns]
I have tried applying the .date() method to this Series, e.g. df.Created_Date.date(), but I get the error AttributeError: 'Series' object has no attribute 'date'
Can someone help me out?
Accepted answer by Phillip Cloud
map over the elements:
In [239]: from operator import methodcaller
In [240]: s = Series(date_range(Timestamp('now'), periods=2))
In [241]: s
Out[241]:
0 2013-10-01 00:24:16
1 2013-10-02 00:24:16
dtype: datetime64[ns]
In [238]: s.map(lambda x: x.strftime('%d-%m-%Y'))
Out[238]:
0 01-10-2013
1 02-10-2013
dtype: object
In [242]: s.map(methodcaller('strftime', '%d-%m-%Y'))
Out[242]:
0 01-10-2013
1 02-10-2013
dtype: object
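Note that '%d-%m-%Y' in the session above produces day-month-year strings, whereas the question asks for month-day-year (09-29-2013). A minimal sketch of the same map approach with the directives reordered, assuming a current pandas imported as pd; the sample values are made up for illustration:

```python
import pandas as pd

# Sample data (illustrative; the first value is the timestamp from the question).
s = pd.Series(pd.to_datetime(["2013-09-29 02:34:44", "2013-10-01 00:24:16"]))

# %m = month, %d = day, %Y = four-digit year, giving MM-DD-YYYY strings.
formatted = s.map(lambda ts: ts.strftime("%m-%d-%Y"))

print(formatted.tolist())  # ['09-29-2013', '10-01-2013']
```

The result holds strings rather than datetimes, so it is best applied as a final display step.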
You can get the raw datetime.date objects by calling the date() method of the Timestamp elements that make up the Series:
In [249]: s.map(methodcaller('date'))
Out[249]:
0 2013-10-01
1 2013-10-02
dtype: object
In [250]: s.map(methodcaller('date')).values
Out[250]:
array([datetime.date(2013, 10, 1), datetime.date(2013, 10, 2)], dtype=object)
Yet another way you can do this is by calling the unbound Timestamp.date method:
In [273]: s.map(Timestamp.date)
Out[273]:
0 2013-10-01
1 2013-10-02
dtype: object
This method is the fastest and, IMHO, the most readable. Timestamp is accessible in the top-level pandas module, like so: pandas.Timestamp. I've imported it directly for expository purposes.
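For readers who do not import Timestamp directly, the unbound-method call could be written through the pd namespace. This is only a sketch with made-up sample values, not a restatement of the session above:

```python
import datetime

import pandas as pd

s = pd.Series(pd.to_datetime(["2013-10-01 00:24:16", "2013-10-02 00:24:16"]))

# map() passes each element to pd.Timestamp.date as its argument,
# so every Timestamp is converted to a plain datetime.date.
dates = s.map(pd.Timestamp.date)

print(dates.tolist())
```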
The date attribute of DatetimeIndex objects does something similar, but returns a numpy object array instead:
In [243]: index = DatetimeIndex(s)
In [244]: index
Out[244]:
<class 'pandas.tseries.index.DatetimeIndex'>
[2013-10-01 00:24:16, 2013-10-02 00:24:16]
Length: 2, Freq: None, Timezone: None
In [246]: index.date
Out[246]:
array([datetime.date(2013, 10, 1), datetime.date(2013, 10, 2)], dtype=object)
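The same idea as a self-contained sketch, assuming a current pandas where DatetimeIndex is reached as pd.DatetimeIndex (the printed repr will differ from the older output shown above); the sample values are illustrative:

```python
import datetime

import numpy as np
import pandas as pd

s = pd.Series(pd.to_datetime(["2013-10-01 00:24:16", "2013-10-02 00:24:16"]))

# Wrap the Series in a DatetimeIndex and read its date attribute.
index = pd.DatetimeIndex(s)
dates = index.date  # NumPy object array of datetime.date values

print(type(dates).__name__, dates.dtype)
```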
For larger datetime64[ns] Series objects, calling Timestamp.date is faster than operator.methodcaller, which is slightly faster than a lambda:
In [263]: f = methodcaller('date')
In [264]: flam = lambda x: x.date()
In [265]: fmeth = Timestamp.date
In [266]: s2 = Series(date_range('20010101', periods=1000000, freq='T'))
In [267]: s2
Out[267]:
0 2001-01-01 00:00:00
1 2001-01-01 00:01:00
2 2001-01-01 00:02:00
3 2001-01-01 00:03:00
4 2001-01-01 00:04:00
5 2001-01-01 00:05:00
6 2001-01-01 00:06:00
7 2001-01-01 00:07:00
8 2001-01-01 00:08:00
9 2001-01-01 00:09:00
10 2001-01-01 00:10:00
11 2001-01-01 00:11:00
12 2001-01-01 00:12:00
13 2001-01-01 00:13:00
14 2001-01-01 00:14:00
...
999985 2002-11-26 10:25:00
999986 2002-11-26 10:26:00
999987 2002-11-26 10:27:00
999988 2002-11-26 10:28:00
999989 2002-11-26 10:29:00
999990 2002-11-26 10:30:00
999991 2002-11-26 10:31:00
999992 2002-11-26 10:32:00
999993 2002-11-26 10:33:00
999994 2002-11-26 10:34:00
999995 2002-11-26 10:35:00
999996 2002-11-26 10:36:00
999997 2002-11-26 10:37:00
999998 2002-11-26 10:38:00
999999 2002-11-26 10:39:00
Length: 1000000, dtype: datetime64[ns]
In [269]: timeit s2.map(f)
1 loops, best of 3: 1.04 s per loop
In [270]: timeit s2.map(flam)
1 loops, best of 3: 1.1 s per loop
In [271]: timeit s2.map(fmeth)
1 loops, best of 3: 968 ms per loop
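Outside IPython, a comparison like this could be approximated with the standard timeit module. The sketch below uses a much smaller Series so it runs quickly, and the names candidates and seconds are illustrative. Timings depend on the machine and pandas version, so the ranking reported above may not reproduce:

```python
import timeit
from operator import methodcaller

import pandas as pd

# Smaller than the one-million-row Series above so the sketch finishes fast.
s2 = pd.Series(pd.date_range("2001-01-01", periods=10_000, freq="min"))

# The three callables compared in the session above.
candidates = {
    "methodcaller": methodcaller("date"),
    "lambda": lambda x: x.date(),
    "unbound method": pd.Timestamp.date,
}

for name, func in candidates.items():
    # Best of three single runs, similar in spirit to IPython's timeit.
    seconds = min(timeit.repeat(lambda: s2.map(func), number=1, repeat=3))
    print(f"{name}: {seconds:.4f} s")
```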
Keep in mind that one of the goals of pandas is to provide a layer on top of numpy so that (most of the time) you don't have to deal with the low-level details of the ndarray. So getting the raw datetime.date objects in an array is of limited use, since they don't correspond to any numpy.dtype that is supported by pandas (pandas only supports datetime64[ns] [that's nanoseconds] dtypes). That said, sometimes you need to do this.
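If the goal is only to drop the time of day while staying in a datetime dtype, one option not mentioned in the answer is Series.dt.normalize. A minimal sketch contrasting it with .dt.date, using illustrative sample values:

```python
import pandas as pd

s = pd.Series(pd.to_datetime(["2013-10-01 00:24:16", "2013-10-02 00:24:16"]))

# normalize() sets the time to midnight and keeps a datetime64 dtype.
midnight = s.dt.normalize()

# .dt.date converts to Python date objects, so the dtype becomes object.
as_dates = s.dt.date

print(midnight.dtype, as_dates.dtype)
```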
Answer by drenerbas
Maybe this only came in recently, but there are built-in methods for this. Try:
In [27]: s = pd.Series(pd.date_range(pd.Timestamp('now'), periods=2))
In [28]: s
Out[28]:
0 2016-02-11 19:11:43.386016
1 2016-02-12 19:11:43.386016
dtype: datetime64[ns]
In [29]: s.dt.to_pydatetime()
Out[29]:
array([datetime.datetime(2016, 2, 11, 19, 11, 43, 386016),
datetime.datetime(2016, 2, 12, 19, 11, 43, 386016)], dtype=object)
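The .dt accessor also offers strftime, which formats the whole column in one call and gets directly to the MM-DD-YYYY strings the question asks for. A minimal sketch with illustrative values:

```python
import pandas as pd

s = pd.Series(pd.to_datetime(["2013-09-29 02:34:44", "2016-02-11 19:11:43"]))

# Format every element as MM-DD-YYYY without an explicit map().
formatted = s.dt.strftime("%m-%d-%Y")

print(formatted.tolist())  # ['09-29-2013', '02-11-2016']
```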
Answer by student
You can try using .dt.date on a datetime64[ns] column of the dataframe.
For example: df['Created_date'] = df['Created_date'].dt.date
Input dataframe named test_df:
print(test_df)
Result:
Created_date
0 2015-03-04 15:39:16
1 2015-03-22 17:36:49
2 2015-03-25 22:08:45
3 2015-03-16 13:45:20
4 2015-03-19 18:53:50
Checking dtypes:
print(test_df.dtypes)
Result:
Created_date datetime64[ns]
dtype: object
Extracting the date and updating the Created_date column:
test_df['Created_date'] = test_df['Created_date'].dt.date
print(test_df)
Result:
Created_date
0 2015-03-04
1 2015-03-22
2 2015-03-25
3 2015-03-16
4 2015-03-19
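The answer does not show how test_df was built. A self-contained sketch, assuming it was created from the timestamps printed above:

```python
import datetime

import pandas as pd

# Assumed construction of test_df from the values shown in the output above.
test_df = pd.DataFrame({
    "Created_date": pd.to_datetime([
        "2015-03-04 15:39:16",
        "2015-03-22 17:36:49",
        "2015-03-25 22:08:45",
        "2015-03-16 13:45:20",
        "2015-03-19 18:53:50",
    ])
})

# Replace the timestamps with plain datetime.date objects.
test_df["Created_date"] = test_df["Created_date"].dt.date

print(test_df["Created_date"].dtype)    # object
print(test_df["Created_date"].iloc[0])  # 2015-03-04
```

After this step the column holds Python date objects with object dtype, so the .dt accessor is no longer available on it.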
Answer by Charles Haynes
Well, I would do it this way.
pdTime = pd.date_range(timeStamp, periods=len(years), freq="D")
pdTime[i].strftime('%m-%d-%Y')
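The snippet above leaves timeStamp, years and i undefined. A runnable sketch with hypothetical stand-in values, also showing that the whole DatetimeIndex can be formatted at once:

```python
import pandas as pd

# Hypothetical stand-ins for the names the snippet above does not define.
timeStamp = pd.Timestamp("2013-09-29 02:34:44")
years = [2013, 2014, 2015]

pdTime = pd.date_range(timeStamp, periods=len(years), freq="D")

# Format a single element (i = 0 here) ...
first = pdTime[0].strftime("%m-%d-%Y")

# ... or every element at once via DatetimeIndex.strftime.
all_formatted = pdTime.strftime("%m-%d-%Y")

print(first)                # 09-29-2013
print(list(all_formatted))  # ['09-29-2013', '09-30-2013', '10-01-2013']
```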

