Converting different date time formats to MM/DD/YYYY format in pandas dataframe
Original question: http://stackoverflow.com/questions/45531489/
Note: this page reproduces a popular Stack Overflow question and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same license, cite the original URL, and attribute it to the original authors on Stack Overflow.
Asked by Chris T.
I have a date column in a pandas.DataFrame that holds values in various date-time formats, each stored as a list object, like the following:
date
1 [May 23rd, 2011]
2 [January 1st, 2010]
...
99 [Apr. 15, 2008]
100 [07-11-2013]
...
256 [9/01/1995]
257 [04/15/2000]
258 [11/22/68]
...
360 [12/1997]
361 [08/2002]
...
463 [2014]
464 [2016]
For the sake of convenience, I want to convert them all to MM/DD/YYYY format. A regex replace() does not seem to be an option, since that operation cannot be run on list objects. Calling strptime() on each cell would also be too time-consuming.
What would be an easier way to convert them all to the desired MM/DD/YYYY format? I found it very hard to do this on list objects within a dataframe.
Note: for cell values of the form [YYYY] (e.g., [2014] and [2016]), I will assume they are the first day of that year (e.g., January 1, 2014 for [2014]), and for cell values such as [08/2002] (or [8/2002]), I will assume they are the first day of that month (i.e., August 1, 2002).
Answered by Stephen Rauch
Given your sample data, with the addition of an empty value that parses to NaT, this works:
Code:
df.date.apply(lambda x: pd.to_datetime(x).strftime('%m/%d/%Y')[0])
Test Code:
import pandas as pd
df = pd.DataFrame([
    [['']],
    [['May 23rd, 2011']],
    [['January 1st, 2010']],
    [['Apr. 15, 2008']],
    [['07-11-2013']],
    [['9/01/1995']],
    [['04/15/2000']],
    [['11/22/68']],
    [['12/1997']],
    [['08/2002']],
    [['2014']],
    [['2016']],
], columns=['date'])
df['clean_date'] = df.date.apply(
    lambda x: pd.to_datetime(x).strftime('%m/%d/%Y')[0])
print(df)
Results:
date clean_date
0 [] NaT
1 [May 23rd, 2011] 05/23/2011
2 [January 1st, 2010] 01/01/2010
3 [Apr. 15, 2008] 04/15/2008
4 [07-11-2013] 07/11/2013
5 [9/01/1995] 09/01/1995
6 [04/15/2000] 04/15/2000
7 [11/22/68] 11/22/1968
8 [12/1997] 12/01/1997
9 [08/2002] 08/01/2002
10 [2014] 01/01/2014
11 [2016] 01/01/2016
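The one-liner above assumes every cell is a non-empty list holding a parseable string. As a minimal sketch of a more defensive variant, the hypothetical helper `to_mmddyyyy` below (not part of the original answer) unwraps the list, parses with `errors="coerce"`, and returns `None` for anything that cannot be parsed. The sample values are illustrative only.

```python
import pandas as pd


def to_mmddyyyy(cell):
    """Format the first item of a list cell as MM/DD/YYYY, or return None."""
    # Hypothetical helper: treat empty lists and missing cells as "no date".
    if not cell:
        return None
    # errors="coerce" turns unparseable input into NaT instead of raising.
    parsed = pd.to_datetime(cell[0], errors="coerce")
    if pd.isna(parsed):
        return None
    return parsed.strftime("%m/%d/%Y")


df = pd.DataFrame({"date": [["04/15/2000"], ["2014"], ["not a date"], [""], []]})
df["clean_date"] = df["date"].apply(to_mmddyyyy)
print(df)
```

Two-digit years such as 68 are inherently ambiguous, so it is worth checking which century the parser picks for values like 11/22/68 in your environment.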
Answered by Ashu007
A simpler option is to parse the column with pd.to_datetime. With dayfirst=False and yearfirst=False (the defaults), ambiguous dates are read month-first; you can then apply strftime to get the MM/DD/YYYY text:
df['Date_ColumnName'] = pd.to_datetime(df['Date_ColumnName'], dayfirst = False, yearfirst = False)
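To illustrate the follow-up strftime step, here is a minimal sketch assuming a column of plain month-first strings rather than lists. The sample values and the `formatted` column name are made up for the example.

```python
import pandas as pd

# Illustrative data: month-first strings without zero padding.
df = pd.DataFrame({"Date_ColumnName": ["5/2/2009", "12/31/2010", "1/15/2011"]})

# Parse to datetime64 values, reading ambiguous dates month-first.
parsed = pd.to_datetime(df["Date_ColumnName"], dayfirst=False, yearfirst=False)

# Format back to zero-padded MM/DD/YYYY strings.
df["formatted"] = parsed.dt.strftime("%m/%d/%Y")
print(df)
```

Note that the parsed column holds datetime values; only the strftime result is text in MM/DD/YYYY form.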
Answered by Sarender Reddy
The code below works for the following scenarios:
- Change the date format from M/D/YYYY to MM/DD/YYYY (5/2/2009 to 05/02/2009)
- Change from any other format to MM/DD/YYYY
import pandas as pd
'''
* Check whether the date format in the input file is correct.
* If it is, change the date format from M/D/YYYY to MM/DD/YYYY.
* Otherwise, the date format in the input file is not the expected one:
  change the dates from any format to MM/DD/YYYY.
'''
input_file_name = 'C:/Users/Admin/Desktop/SarenderReddy/predictions.csv'
dest_file_name = 'C:/Users/Admin/Desktop/SarenderReddy/Enrich.csv'
#input_file_name = 'C:/Users/Admin/Desktop/SarenderReddy/enrichment.csv'
read_data = pd.read_csv(input_file_name)
# True only if every value in the Date column parses as month/day/year
is_valid = pd.to_datetime(read_data['Date'], format='%m/%d/%Y', errors='coerce').notnull().all()
print(is_valid)
if is_valid:
    print("Provided correct input date format in input file....!")
    read_data['Date'] = pd.to_datetime(read_data['Date'], format='%m/%d/%Y')
    # Write the dates back zero-padded as MM/DD/YYYY
    read_data['Date'] = read_data['Date'].dt.strftime('%m/%d/%Y')
    read_data.to_csv(dest_file_name, index=False)
    print(read_data['Date'])
else:
    print("NOT... Provided correct input date format in input file....!")
    # Re-read the file and let pandas parse the dates, treating them as day-first
    data_format = pd.read_csv(input_file_name, parse_dates=['Date'], dayfirst=True)
    #print(df['Date'])
    # No fixed format here: the column is already known not to match '%m/%d/%Y'
    data_format['Date'] = pd.to_datetime(data_format['Date'], dayfirst=True)
    data_format['Date'] = data_format['Date'].dt.strftime('%m/%d/%Y')
    data_format.to_csv(dest_file_name, index=False)
    print(data_format['Date'])
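The format check at the heart of this answer can be tried without any CSV files. The sketch below wraps it in a hypothetical function, `matches_format` (the name and the sample Series are assumptions for illustration), which reports whether every value parses under a given format.

```python
import pandas as pd


def matches_format(values, fmt="%m/%d/%Y"):
    """Return True if every value parses under fmt (hypothetical helper)."""
    # Values that do not fit the format become NaT instead of raising.
    parsed = pd.to_datetime(values, format=fmt, errors="coerce")
    return bool(parsed.notna().all())


good = pd.Series(["5/2/2009", "12/31/2010"])
bad = pd.Series(["31/12/2010", "5/2/2009"])  # first value is day-first

print(matches_format(good))
print(matches_format(bad))
```

A check like this only confirms that the values fit the pattern. A day-first date such as 2/5/2009 also fits month/day/year, so it cannot tell which order was intended.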