Note: this page is a translation of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL and author information, and attribute it to the original authors (not me). Original Stack Overflow question: http://stackoverflow.com/questions/45531489/


Converting different date time formats to MM/DD/YYYY format in pandas dataframe

Tags: python, list, pandas, datetime, dataframe

Asked by Chris T.

I have a date column in a pandas.DataFrame that holds dates in various date-time formats, each stored as a list object, like the following:

            date
1    [May 23rd, 2011]
2    [January 1st, 2010]
    ...
99   [Apr. 15, 2008]
100  [07-11-2013]
    ...
256  [9/01/1995]
257  [04/15/2000]
258  [11/22/68]
    ...
360  [12/1997]
361  [08/2002]
     ...
463  [2014]
464  [2016]

For the sake of convenience, I want to convert them all to MM/DD/YYYY format. It doesn't seem possible to do this with a regex replace(), since that operation cannot be applied to list objects. Calling strptime() on each cell would also be too time-consuming.

What would be an easier way to convert them all to the desired MM/DD/YYYY format? I found it very hard to do this on list objects within a dataframe.

Note: for cell values of the form [YYYY] (e.g., [2014] and [2016]), I will assume they are the first day of that year (e.g., January 1, 2014), and for cell values such as [08/2002] (or [8/2002]), I will assume they are the first day of that month (i.e., August 1, 2002).
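
The assumption in the note can be made explicit by giving pd.to_datetime a format for the partial values. The following is a minimal sketch and not part of the original question; the helper name parse_partial is hypothetical, and it assumes the input is either a four-digit year or a month/year pair.

```python
import pandas as pd

def parse_partial(text):
    """Parse 'YYYY' or 'MM/YYYY' (hypothetical helper, assumed input shapes)."""
    if len(text) == 4 and text.isdigit():
        # Only a year is given: the month and day default to January 1.
        return pd.to_datetime(text, format='%Y')
    # Month and year are given: the day defaults to the first of the month.
    return pd.to_datetime(text, format='%m/%Y')

year_only = parse_partial('2014').strftime('%m/%d/%Y')
month_year = parse_partial('08/2002').strftime('%m/%d/%Y')
print(year_only, month_year)
```

Passing an explicit format keeps the "first day" rule visible in the code instead of relying on whatever the parser infers.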

Answered by Stephen Rauch

Given your sample data, with the addition of a NaT, this works:

Code:

df.date.apply(lambda x: pd.to_datetime(x).strftime('%m/%d/%Y')[0])

Test Code:

import pandas as pd

df = pd.DataFrame([
    [['']],
    [['May 23rd, 2011']],
    [['January 1st, 2010']],
    [['Apr. 15, 2008']],
    [['07-11-2013']],
    [['9/01/1995']],
    [['04/15/2000']],
    [['11/22/68']],
    [['12/1997']],
    [['08/2002']],
    [['2014']],
    [['2016']],
], columns=['date'])

df['clean_date'] = df.date.apply(
    lambda x: pd.to_datetime(x).strftime('%m/%d/%Y')[0])

print(df)

Results:

                   date  clean_date
0                    []         NaT
1      [May 23rd, 2011]  05/23/2011
2   [January 1st, 2010]  01/01/2010
3       [Apr. 15, 2008]  04/15/2008
4          [07-11-2013]  07/11/2013
5           [9/01/1995]  09/01/1995
6          [04/15/2000]  04/15/2000
7            [11/22/68]  11/22/1968
8             [12/1997]  12/01/1997
9             [08/2002]  08/01/2002
10               [2014]  01/01/2014
11               [2016]  01/01/2016
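
Two rows in this output may differ between pandas versions: how the empty cell is displayed, and which century a two-digit year such as 68 is assigned to. A more defensive variant is sketched below. It is not part of the original answer; the helper name clean_cell and the latest_year cutoff are assumptions made for illustration, and the cutoff should be set to the latest year that can plausibly appear in your data.

```python
import pandas as pd

def clean_cell(cell, latest_year=2025):
    """Format a one-element list such as ['11/22/68'] as MM/DD/YYYY.

    Hypothetical helper: returns None for empty or unparseable cells.
    latest_year is an assumed cutoff, not something pandas provides.
    """
    text = cell[0] if cell else ''
    ts = pd.to_datetime(text, errors='coerce')  # unparseable input becomes NaT
    if pd.isna(ts):
        return None
    if ts.year > latest_year:
        # A two-digit year was placed in the wrong century; move it back.
        ts = ts - pd.DateOffset(years=100)
    return ts.strftime('%m/%d/%Y')

df = pd.DataFrame({'date': [[''], ['04/15/2000'], ['11/22/68'], ['2014']]})
df['clean_date'] = df['date'].apply(clean_cell)
print(df)
```

Returning None for missing values keeps the column easy to filter with isna() afterwards.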

Answered by Ashu007

It would be better to use the following, which parses the dates month-first; you can then apply strftime to get the format you want:

df['Date_ColumnName'] = pd.to_datetime(df['Date_ColumnName'], dayfirst = False, yearfirst = False)
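
The strftime step mentioned above could look like the sketch below. The sample data and the clean_date column name are assumptions; because the question stores each value as a one-element list, the sketch first unwraps the lists with .str[0].

```python
import pandas as pd

# Assumed sample data: one-element lists, as in the question.
df = pd.DataFrame({'Date_ColumnName': [['5/2/2009'], ['12/31/2010']]})

# Unwrap each one-element list into a plain string.
dates = df['Date_ColumnName'].str[0]

# Parse month-first, then render as zero-padded MM/DD/YYYY strings.
parsed = pd.to_datetime(dates, dayfirst=False, yearfirst=False)
df['clean_date'] = parsed.dt.strftime('%m/%d/%Y')
print(df)
```

Note that this vectorized call expects the column to share one format; a column that mixes formats as heavily as the one in the question may need per-cell parsing instead.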

Answered by Sarender Reddy

The code below works for the following scenarios:

  • Change the date format from M/D/YYYY to MM/DD/YYYY (5/2/2009 to 05/02/2009)
  • Change from any other format to MM/DD/YYYY

import pandas as pd

'''
Check whether the dates in the input file already match the M/D/YYYY format.
  * If they do, rewrite them as zero-padded MM/DD/YYYY.
  * If they do not, let pandas parse whatever format is present, then
    rewrite the dates as MM/DD/YYYY.
'''
input_file_name = 'C:/Users/Admin/Desktop/SarenderReddy/predictions.csv'
dest_file_name = 'C:/Users/Admin/Desktop/SarenderReddy/Enrich.csv'
#input_file_name = 'C:/Users/Admin/Desktop/SarenderReddy/enrichment.csv'
read_data = pd.read_csv(input_file_name)
print(pd.to_datetime(read_data['Date'], format='%m/%d/%Y', errors='coerce').notnull().all())

if pd.to_datetime(read_data['Date'], format='%m/%d/%Y', errors='coerce').notnull().all():
    print("Provided correct input date format in input file....!")
    read_data['Date'] = pd.to_datetime(read_data['Date'],format='%m/%d/%Y')
    read_data['Date'] = read_data['Date'].dt.strftime('%m/%d/%Y')
    read_data.to_csv(dest_file_name,index=False)
    print(read_data['Date'])
else:
    print("NOT... Provided correct input date format in input file....!")
    data_format = pd.read_csv(input_file_name,parse_dates=['Date'], dayfirst=True)
    #print(df['Date'])
    data_format['Date'] = pd.to_datetime(data_format['Date'],format='%m/%d/%Y')
    data_format['Date'] = data_format['Date'].dt.strftime('%m/%d/%Y')
    data_format.to_csv(dest_file_name,index=False)
    print(data_format['Date'])
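
The same check-then-fall-back idea can be tried without any files. The sketch below is an illustration rather than the answer's code: the function name normalize_dates and the sample series are assumptions, and the fallback relies on pandas inferring the format instead of re-reading the CSV with dayfirst=True.

```python
import pandas as pd

def normalize_dates(dates, expected_format='%m/%d/%Y'):
    """Return the dates as MM/DD/YYYY strings (hypothetical helper)."""
    # Strict pass: values that do not match expected_format become NaT.
    strict = pd.to_datetime(dates, format=expected_format, errors='coerce')
    if strict.notnull().all():
        parsed = strict
    else:
        # Fallback: let pandas infer the format; failures still become NaT.
        parsed = pd.to_datetime(dates, errors='coerce')
    return parsed.dt.strftime('%m/%d/%Y')

already_ok = normalize_dates(pd.Series(['5/2/2009', '12/31/2010']))
iso_input = normalize_dates(pd.Series(['2009-05-02', '2010-12-31']))
print(already_ok.tolist())
print(iso_input.tolist())
```

Running the strict pass once and reusing its result avoids parsing the column twice just to test the format.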