Extracting hour, date and month from a pandas (python) DataFrame object
Notice: this page reproduces a popular StackOverflow question and its answers under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL, and attribute it to the original authors (not me): StackOverFlow
Original URL: http://stackoverflow.com/questions/27042575/
Asked by Duna
With a pandas DataFrame df:
                     x         y
0  29-04-2014 07:40:52  07:40:52
1  29-04-2014 07:59:25  07:15:00
2                  NaN       NaN
3  29-04-2014 08:16:12  07:45:00
4  09-05-2014 08:19:14  07:30:00
5  23-04-2014 08:27:16  08:30:00
6  29-04-2014 08:37:16  07:00:00
7                  NaN       NaN
8  29-04-2014 08:41:16  07:30:00
9  25-03-2014 08:42:16  07:30:00
where column x contains the Date-Month-Year Hour:Minute:Second and column y stores the Hour:Minute:Second, how can one extract only:
- the date [29 from row 0] of column x,
- the month [04 or April from row 0] of column x,
- the month and date [29-04 or 29-April from row 0] of column x,
- the hour and minute [07-40 from row 0] of columns x and y
I imported the DataFrame from a text file using
df = pd.read_table(r"C:\data.txt", sep='\t')
but it was originally from MS Excel or MS Access.
When I run df.dtypes, I get
x object
y object
dtype: object
I am working with Pandas version 0.14.1 in Python 3.4.
Example DataFrame
import numpy as np
import pandas as pd

df = pd.DataFrame({'x': ['29-04-2014 07:40:52', np.nan, '29-04-2014 08:16:16', '29-04-2014 08:19:56', '29-04-2014 08:27:20'],
                   'y': ['07:40:52', '07:15:00', np.nan, '07:45:00', '07:30:00']})
Answered by JD Long
I think the Pandas way is to make x your index; then you can use some simple methods to extract what you want. The non-Pandas way is to use the datetime module.
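For the non-Pandas route, a minimal sketch with the standard-library datetime module might look like the following. The sample string is row 0 of column x from the question, and the format string is an assumption based on the Date-Month-Year Hour:Minute:Second layout described there.

```python
from datetime import datetime

# Sample value copied from row 0 of column x in the question.
raw = '29-04-2014 07:40:52'

# Assumed format: day-month-year, then hour:minute:second.
ts = datetime.strptime(raw, '%d-%m-%Y %H:%M:%S')

day = ts.day                        # 29
month = ts.month                    # 4
day_month = ts.strftime('%d-%m')    # '29-04'
hour_minute = ts.strftime('%H-%M')  # '07-40'

print(day, month, day_month, hour_minute)
```

This handles one string at a time, so missing values (NaN) would need to be skipped before calling strptime.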
Pandas way... For background you can read the documentation around time series data, which is quite good.
Set up some example data:
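import numpy as np
import pandas as pd

n = 10
df = pd.DataFrame(pd.date_range('1/1/2000', periods=n))
df.columns = ['x']
df['z'] = np.random.rand(n)      # random values, only for illustration
df.set_index('x', inplace=True)  # make the dates the index
print(df)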
z
x
2000-01-01 0.863064
2000-01-02 0.980083
2000-01-03 0.278810
2000-01-04 0.960890
2000-01-05 0.309591
2000-01-06 0.662498
2000-01-07 0.802367
2000-01-08 0.403791
2000-01-09 0.981172
2000-01-10 0.342935
Note that it does not appear your data is set with x as the index. That's an important step.
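Since the question's x column holds strings (dtype object), it would first have to be parsed before it can serve as a datetime index. A rough sketch, assuming the day-month-year format shown in the question and using a shortened copy of its example frame:

```python
import numpy as np
import pandas as pd

# Shortened version of the example DataFrame from the question.
df = pd.DataFrame({
    'x': ['29-04-2014 07:40:52', np.nan, '29-04-2014 08:16:16'],
    'y': ['07:40:52', '07:15:00', np.nan],
})

# Parse the strings into timestamps; missing values become NaT.
# The format string is an assumption based on the sample rows.
df['x'] = pd.to_datetime(df['x'], format='%d-%m-%Y %H:%M:%S')

# Drop rows without a timestamp, then use x as the index.
df = df.dropna(subset=['x']).set_index('x')

print(df.index.day)
print(df.index.month)
```

Dropping the NaT rows is a design choice made here for simplicity; whether that is acceptable depends on what the missing rows mean in the real data.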
After you have your dates as timestamps you can access what you are after:
df['d'] = df.index.day    # day of the month
df['m'] = df.index.month  # month number
df['y'] = df.index.year   # four-digit year
print(df)
z y d m
x
2000-01-01 0.863064 2000 1 1
2000-01-02 0.980083 2000 2 1
2000-01-03 0.278810 2000 3 1
2000-01-04 0.960890 2000 4 1
2000-01-05 0.309591 2000 5 1
2000-01-06 0.662498 2000 6 1
2000-01-07 0.802367 2000 7 1
2000-01-08 0.403791 2000 8 1
2000-01-09 0.981172 2000 9 1
2000-01-10 0.342935 2000 10 1
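The question also asks for combined pieces such as month-and-date and hour-and-minute. One possible sketch uses the `.dt` accessor on parsed columns instead of the index. This assumes a pandas version newer than the 0.14.1 mentioned in the question, and the new column names are made up for illustration.

```python
import numpy as np
import pandas as pd

# Shortened version of the example DataFrame from the question.
df = pd.DataFrame({
    'x': ['29-04-2014 07:40:52', np.nan, '29-04-2014 08:16:16'],
    'y': ['07:40:52', '07:15:00', np.nan],
})

# Parse both columns; the format strings are assumptions from the sample rows.
x = pd.to_datetime(df['x'], format='%d-%m-%Y %H:%M:%S')
y = pd.to_datetime(df['y'], format='%H:%M:%S')  # date part defaults to 1900-01-01

# Hypothetical output columns; rows with missing input stay missing.
df['x_day'] = x.dt.day
df['x_month'] = x.dt.month
df['x_day_month'] = x.dt.strftime('%d-%m')
df['x_hour_min'] = x.dt.strftime('%H-%M')
df['y_hour_min'] = y.dt.strftime('%H-%M')

print(df)
```

Using `'%d-%B'` instead of `'%d-%m'` should give the month name rather than its number, though the exact name depends on the locale.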

