Get MM-DD-YYYY from pandas Timestamp
Original question: http://stackoverflow.com/questions/19105976/
Warning: this page reproduces a popular Stack Overflow question and its answers, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same license, cite the original URL and author information, and attribute it to the original authors (not me): Stack Overflow
Asked by blaklaybul
Dates seem to be a tricky thing in Python, and I am having a lot of trouble simply stripping the date out of a pandas Timestamp. I would like to get from 2013-09-29 02:34:44 to simply 09-29-2013.
I have a DataFrame with a column Created_Date:
Name: Created_Date, Length: 1162549, dtype: datetime64[ns]
I have tried applying the .date() method on this Series, e.g. df.Created_Date.date(), but I get the error AttributeError: 'Series' object has no attribute 'date'
Can someone help me out?
Accepted answer by Phillip Cloud
map over the elements:
In [239]: from operator import methodcaller
In [240]: s = Series(date_range(Timestamp('now'), periods=2))
In [241]: s
Out[241]:
0 2013-10-01 00:24:16
1 2013-10-02 00:24:16
dtype: datetime64[ns]
In [238]: s.map(lambda x: x.strftime('%m-%d-%Y'))
Out[238]:
0 10-01-2013
1 10-02-2013
dtype: object
In [242]: s.map(methodcaller('strftime', '%m-%d-%Y'))
Out[242]:
0 10-01-2013
1 10-02-2013
dtype: object
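For the MM-DD-YYYY order asked about in the question, the format string would be '%m-%d-%Y'. As a minimal sketch, and assuming a pandas version that provides the Series.dt accessor (which is not used in the session above), the same formatting can also be done without map. The sample values are made up for illustration:

```python
import pandas as pd

# Hypothetical sample data: two timestamps one day apart
s = pd.Series(pd.date_range("2013-09-29 02:34:44", periods=2, freq="D"))

# Element-wise formatting with map, one Timestamp at a time
mapped = s.map(lambda x: x.strftime("%m-%d-%Y"))

# Vectorized formatting through the .dt accessor
formatted = s.dt.strftime("%m-%d-%Y")

print(formatted.tolist())  # ['09-29-2013', '09-30-2013']
```

Both variants return strings, so the result is no longer a datetime column and cannot be used for date arithmetic without parsing it again.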
You can get the raw datetime.date objects by calling the date() method of the Timestamp elements that make up the Series:
In [249]: s.map(methodcaller('date'))
Out[249]:
0 2013-10-01
1 2013-10-02
dtype: object
In [250]: s.map(methodcaller('date')).values
Out[250]:
array([datetime.date(2013, 10, 1), datetime.date(2013, 10, 2)], dtype=object)
Yet another way you can do this is by calling the unbound Timestamp.date method:
In [273]: s.map(Timestamp.date)
Out[273]:
0 2013-10-01
1 2013-10-02
dtype: object
This method is the fastest, and IMHO the most readable. Timestamp is accessible in the top-level pandas module, like so: pandas.Timestamp. I've imported it directly for expository purposes.
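The three map variants shown so far can be put side by side in one self-contained script. This is only a sketch: it uses the usual `import pandas as pd` alias rather than the direct imports of the session, and the sample timestamps are chosen to resemble the ones printed above.

```python
from operator import methodcaller

import pandas as pd

# Hypothetical sample data resembling the Series in the session above
s = pd.Series(pd.date_range("2013-10-01 00:24:16", periods=2, freq="D"))

via_lambda = s.map(lambda x: x.date())          # explicit lambda
via_methodcaller = s.map(methodcaller("date"))  # operator.methodcaller
via_unbound = s.map(pd.Timestamp.date)          # unbound method

# All three hold the same datetime.date objects
print(via_unbound.tolist())
```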
The date attribute of DatetimeIndex objects does something similar, but returns a numpy object array instead:
In [243]: index = DatetimeIndex(s)
In [244]: index
Out[244]:
<class 'pandas.tseries.index.DatetimeIndex'>
[2013-10-01 00:24:16, 2013-10-02 00:24:16]
Length: 2, Freq: None, Timezone: None
In [246]: index.date
Out[246]:
array([datetime.date(2013, 10, 1), datetime.date(2013, 10, 2)], dtype=object)
For larger datetime64[ns] Series objects, calling Timestamp.date is faster than operator.methodcaller, which is slightly faster than a lambda:
In [263]: f = methodcaller('date')
In [264]: flam = lambda x: x.date()
In [265]: fmeth = Timestamp.date
In [266]: s2 = Series(date_range('20010101', periods=1000000, freq='T'))
In [267]: s2
Out[267]:
0 2001-01-01 00:00:00
1 2001-01-01 00:01:00
2 2001-01-01 00:02:00
3 2001-01-01 00:03:00
4 2001-01-01 00:04:00
5 2001-01-01 00:05:00
6 2001-01-01 00:06:00
7 2001-01-01 00:07:00
8 2001-01-01 00:08:00
9 2001-01-01 00:09:00
10 2001-01-01 00:10:00
11 2001-01-01 00:11:00
12 2001-01-01 00:12:00
13 2001-01-01 00:13:00
14 2001-01-01 00:14:00
...
999985 2002-11-26 10:25:00
999986 2002-11-26 10:26:00
999987 2002-11-26 10:27:00
999988 2002-11-26 10:28:00
999989 2002-11-26 10:29:00
999990 2002-11-26 10:30:00
999991 2002-11-26 10:31:00
999992 2002-11-26 10:32:00
999993 2002-11-26 10:33:00
999994 2002-11-26 10:34:00
999995 2002-11-26 10:35:00
999996 2002-11-26 10:36:00
999997 2002-11-26 10:37:00
999998 2002-11-26 10:38:00
999999 2002-11-26 10:39:00
Length: 1000000, dtype: datetime64[ns]
In [269]: timeit s2.map(f)
1 loops, best of 3: 1.04 s per loop
In [270]: timeit s2.map(flam)
1 loops, best of 3: 1.1 s per loop
In [271]: timeit s2.map(fmeth)
1 loops, best of 3: 968 ms per loop
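Outside IPython, a comparable measurement can be sketched with the standard-library timeit module. The Series here is deliberately much smaller than the one above so that the script finishes quickly, and the 'min' frequency alias is an assumption made for newer pandas versions. Absolute numbers depend on the machine and the pandas version, so none are quoted here.

```python
import timeit
from operator import methodcaller

import pandas as pd

# Smaller than the 1,000,000-row Series above so the sketch runs quickly;
# 'min' is used as the minute frequency alias (an assumption for newer pandas)
s2 = pd.Series(pd.date_range("20010101", periods=10_000, freq="min"))

candidates = {
    "methodcaller": methodcaller("date"),
    "lambda": lambda x: x.date(),
    "Timestamp.date": pd.Timestamp.date,
}

timings = {}
for name, func in candidates.items():
    # Best of 3 runs with one call each, mirroring "best of 3" above
    timings[name] = min(timeit.repeat(lambda: s2.map(func), number=1, repeat=3))

for name, seconds in timings.items():
    print(f"{name}: {seconds:.4f} s")
```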
Keep in mind that one of the goals of pandas is to provide a layer on top of numpy so that (most of the time) you don't have to deal with the low-level details of the ndarray. Getting the raw datetime.date objects in an array is therefore of limited use, since they don't correspond to any numpy.dtype that is supported by pandas (at the time of writing, pandas only supports datetime64[ns] dtypes, i.e. nanosecond resolution). That said, sometimes you need to do this.
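The dtype point can be illustrated with a short sketch. It assumes a pandas version with the Series.dt accessor, and it uses .dt.normalize() as one possible way to drop the time of day while keeping a datetime64 dtype; that call is not mentioned in the answer itself. The sample values are made up.

```python
import pandas as pd

# Hypothetical sample data
s = pd.Series(pd.to_datetime(["2013-09-29 02:34:44", "2013-10-01 00:24:16"]))

# .dt.date yields Python datetime.date objects, so the Series falls back
# to the generic object dtype
dates = s.dt.date
print(dates.dtype)  # object

# .dt.normalize() resets the time to midnight but keeps a datetime64 dtype,
# so datetime accessors and arithmetic remain available
midnight = s.dt.normalize()
print(midnight.dt.year.tolist())  # [2013, 2013]
```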
Answer by drenerbas
Maybe this only came in recently, but there are built-in methods for this. Try:
In [27]: s = pd.Series(pd.date_range(pd.Timestamp('now'), periods=2))
In [28]: s
Out[28]:
0 2016-02-11 19:11:43.386016
1 2016-02-12 19:11:43.386016
dtype: datetime64[ns]
In [29]: s.dt.to_pydatetime()
Out[29]:
array([datetime.datetime(2016, 2, 11, 19, 11, 43, 386016),
       datetime.datetime(2016, 2, 12, 19, 11, 43, 386016)], dtype=object)
Answer by student
You can try using .dt.date on a datetime64[ns] column of the DataFrame.
For example: df['Created_date'] = df['Created_date'].dt.date
Input DataFrame, named test_df:
print(test_df)
Result:
Created_date
0 2015-03-04 15:39:16
1 2015-03-22 17:36:49
2 2015-03-25 22:08:45
3 2015-03-16 13:45:20
4 2015-03-19 18:53:50
Checking dtypes:
print(test_df.dtypes)
Result:
Created_date datetime64[ns]
dtype: object
Extracting the date and updating the Created_date column:
test_df['Created_date'] = test_df['Created_date'].dt.date
print(test_df)
Result:
Created_date
0 2015-03-04
1 2015-03-22
2 2015-03-25
3 2015-03-16
4 2015-03-19
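To try this end to end, the frame can be rebuilt from the values printed above. The sketch below also adds a string column in the MM-DD-YYYY layout the question asks for; the column name Created_mdY is hypothetical and not part of the answer.

```python
import pandas as pd

# Rebuild the example frame from the values printed above
test_df = pd.DataFrame({
    "Created_date": pd.to_datetime([
        "2015-03-04 15:39:16",
        "2015-03-22 17:36:49",
        "2015-03-25 22:08:45",
        "2015-03-16 13:45:20",
        "2015-03-19 18:53:50",
    ])
})

# Hypothetical extra column holding MM-DD-YYYY strings, computed before
# Created_date is overwritten with datetime.date objects
test_df["Created_mdY"] = test_df["Created_date"].dt.strftime("%m-%d-%Y")
test_df["Created_date"] = test_df["Created_date"].dt.date

print(test_df)
```

The strftime call has to come first here: once Created_date holds datetime.date objects, the column has object dtype and the .dt accessor is no longer available on it.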
Answer by Charles Haynes
Well, I would do it this way.
import pandas as pd
# timeStamp, years, and i are assumed to be defined elsewhere by the caller
pdTime = pd.date_range(timeStamp, periods=len(years), freq="D")
pdTime[i].strftime('%m-%d-%Y')
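The snippet above relies on names that the answer does not define. A runnable sketch with made-up stand-ins for timeStamp, years, and i might look like this; it also shows that the whole DatetimeIndex can be formatted in one call:

```python
import pandas as pd

# Hypothetical stand-ins for the undefined names in the snippet above
timeStamp = pd.Timestamp("2013-09-29 02:34:44")
years = [2013, 2014, 2015]
i = 0

pdTime = pd.date_range(timeStamp, periods=len(years), freq="D")

# Format a single element ...
single = pdTime[i].strftime("%m-%d-%Y")
# ... or the whole DatetimeIndex at once
all_formatted = pdTime.strftime("%m-%d-%Y")

print(single)               # 09-29-2013
print(list(all_formatted))  # ['09-29-2013', '09-30-2013', '10-01-2013']
```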