
Note: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverFlow. Original URL: http://stackoverflow.com/questions/38060172/

Time: 2020-09-14 01:28:13  Source: igfitidea

Convert string date to a different format in pandas dataframe

Tags: python, pandas, dataframe

Asked by racekiller

I have been looking for this answer in the community, but so far I could not find it.

I have a dataframe in Python 3.5.1 that contains a column of dates stored as strings, imported from a CSV file.

The dataframe looks like this:

                  TimeStamp  TBD  TBD     Value  TBD
0       2016/06/08 17:19:53  NaN  NaN  0.062942  NaN
1       2016/06/08 17:19:54  NaN  NaN  0.062942  NaN
2       2016/06/08 17:19:54  NaN  NaN  0.062942  NaN

What I need is to change the TimeStamp column format to %m/%d/%Y %H:%M:%S:

                  TimeStamp  TBD  TBD     Value  TBD
0       06/08/2016 17:19:53  NaN  NaN  0.062942  NaN

So far I have found some solutions that work, but only for a single string and not for a Series.

Any help would be appreciated.

Thanks

Answered by unutbu

If you convert the column of strings to a time series, you could use the dt.strftime method:

import numpy as np
import pandas as pd
nan = np.nan
df = pd.DataFrame({'TBD': [nan, nan, nan], 'TBD.1': [nan, nan, nan], 'TBD.2': [nan, nan, nan], 'TimeStamp': ['2016/06/08 17:19:53', '2016/06/08 17:19:54', '2016/06/08 17:19:54'], 'Value': [0.062941999999999998, 0.062941999999999998, 0.062941999999999998]})
df['TimeStamp'] = pd.to_datetime(df['TimeStamp']).dt.strftime('%m/%d/%Y %H:%M:%S')
print(df)

yields


   TBD  TBD.1  TBD.2            TimeStamp     Value
0  NaN    NaN    NaN  06/08/2016 17:19:53  0.062942
1  NaN    NaN    NaN  06/08/2016 17:19:54  0.062942
2  NaN    NaN    NaN  06/08/2016 17:19:54  0.062942
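
One point about the result above: dt.strftime returns strings, so the reformatted column no longer behaves like a datetime column. A minimal sketch, assuming you want both the parsed values and the formatted text (the column names Parsed and Formatted are made up for illustration):

```python
import pandas as pd

df = pd.DataFrame({'TimeStamp': ['2016/06/08 17:19:53', '2016/06/08 17:19:54']})

# Keep the parsed datetimes in a separate (hypothetical) column...
df['Parsed'] = pd.to_datetime(df['TimeStamp'])
# ...and store the reformatted text alongside it.
df['Formatted'] = df['Parsed'].dt.strftime('%m/%d/%Y %H:%M:%S')

print(df.dtypes)
# Datetime arithmetic still works on the parsed column.
print((df['Parsed'].iloc[1] - df['Parsed'].iloc[0]).total_seconds())
```

Keeping the parsed column around is only useful if you need to sort, filter, or do arithmetic on the dates later; for display alone, the formatted strings are enough.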


Since you want to convert a column of strings to another (different) column of strings, you could also use the vectorized str.replace method:

import numpy as np
import pandas as pd
nan = np.nan
df = pd.DataFrame({'TBD': [nan, nan, nan], 'TBD.1': [nan, nan, nan], 'TBD.2': [nan, nan, nan], 'TimeStamp': ['2016/06/08 17:19:53', '2016/06/08 17:19:54', '2016/06/08 17:19:54'], 'Value': [0.062941999999999998, 0.062941999999999998, 0.062941999999999998]})
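# Groups: \1 = year, \2 = month, \3 = day, \4 = the rest (time).
# regex=True is needed on pandas >= 2.0, where str.replace defaults to literal matching.
df['TimeStamp'] = df['TimeStamp'].str.replace(r'(\d+)/(\d+)/(\d+)(.*)', r'\2/\3/\1\4', regex=True)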
print(df)

since


In [32]: df['TimeStamp'].str.replace(r'(\d+)/(\d+)/(\d+)(.*)', r'\2/\3/\1\4', regex=True)
Out[32]: 
0    06/08/2016 17:19:53
1    06/08/2016 17:19:54
2    06/08/2016 17:19:54
Name: TimeStamp, dtype: object

This uses regex to rearrange pieces of the string without first parsing the string as a date. This is faster than the first method (mainly because it skips the parsing step), but it also has the disadvantage of not checking that the date strings are valid dates.
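
To illustrate the validation trade-off, here is a small sketch with a made-up sample in which the second entry has month 13. The explicit format string and errors='coerce' are choices made for this illustration, not part of the answer above:

```python
import pandas as pd

# Hypothetical sample: the second entry has month 13 and is not a real date.
s = pd.Series(['2016/06/08 17:19:53', '2016/13/08 17:19:54'])

# Regex rearrangement: reorders the pieces without checking them.
rearranged = s.str.replace(r'(\d+)/(\d+)/(\d+)(.*)', r'\2/\3/\1\4', regex=True)

# Parsing first: with errors='coerce', unparseable entries become NaT.
parsed = pd.to_datetime(s, format='%Y/%m/%d %H:%M:%S', errors='coerce')
formatted = parsed.dt.strftime('%m/%d/%Y %H:%M:%S')

print(rearranged.tolist())      # the invalid date passes through, just reordered
print(parsed.isna().tolist())   # the invalid date is flagged as missing
```

The regex version silently produces '13/08/2016 17:19:54', while the parsing version marks that row as missing so it can be inspected or dropped.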


Answered by Sarah

For most common date and datetime formats, the pandas to_datetime function can parse them without an explicit format. For example:

df.TimeStamp.apply(lambda x: pd.to_datetime(x))


And for the example given in the question,

df['TimeStamp'] = pd.to_datetime(df['TimeStamp']).dt.strftime('%m/%d/%Y %H:%M:%S')


will give us the same result.


Using .apply can be convenient if you have multiple columns to convert.
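
As a rough sketch of the multi-column case, assuming a frame with two string date columns (the column names start and end are invented for this example), DataFrame.apply can run the same conversion on each column:

```python
import pandas as pd

# Hypothetical frame with two string date columns (names are made up).
df = pd.DataFrame({
    'start': ['2016/06/08 17:19:53', '2016/06/08 17:19:54'],
    'end': ['2016/06/09 08:00:00', '2016/06/10 09:30:00'],
})

date_cols = ['start', 'end']
# DataFrame.apply passes each column (a Series) to the function in turn,
# so pd.to_datetime still works on whole columns rather than single values.
df[date_cols] = df[date_cols].apply(
    lambda col: pd.to_datetime(col).dt.strftime('%m/%d/%Y %H:%M:%S')
)
print(df)
```

Applying per column keeps the conversion vectorized; applying pd.to_datetime to each value of a single column one at a time is likely to be slower on large data.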

Of course, providing the parsing format is necessary in many situations. For a full list of format codes, please see https://docs.python.org/3/library/datetime.html.
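
One situation where the format matters is an ambiguous field order. The sketch below assumes day-first input strings (made up for illustration); without a format, pandas reads such strings month-first by default, so stating the format removes the ambiguity:

```python
import pandas as pd

# Hypothetical day-first strings: '08/06/2016' is meant as 8 June 2016.
s = pd.Series(['08/06/2016 17:19:53', '09/06/2016 17:19:54'])

# With an explicit format, the first field is read as the day.
parsed = pd.to_datetime(s, format='%d/%m/%Y %H:%M:%S')
print(parsed.dt.strftime('%Y-%m-%d').tolist())
```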