pandas: converting string dates to a different format in a pandas DataFrame

Question

Asked by racekiller

I have searched the community for this answer but could not find it.

I have a dataframe in Python 3.5.1 with a column of dates stored as strings, imported from a CSV file.

The dataframe looks like this:

                  TimeStamp  TBD  TBD     Value  TBD
0       2016/06/08 17:19:53  NaN  NaN  0.062942  NaN
1       2016/06/08 17:19:54  NaN  NaN  0.062942  NaN
2       2016/06/08 17:19:54  NaN  NaN  0.062942  NaN

What I need is to change the TimeStamp column format to %m/%d/%Y %H:%M:%S:

                  TimeStamp  TBD  TBD     Value  TBD
0       06/08/2016 17:19:53  NaN  NaN  0.062942  NaN

So far I have found some solutions that work, but only for a single string and not for a Series.

Any help would be appreciated

Thanks

Answer 1

Answered by unutbu

If you convert the column of strings to a time series, you can use the dt.strftime method:

import numpy as np
import pandas as pd
nan = np.nan
df = pd.DataFrame({'TBD': [nan, nan, nan], 'TBD.1': [nan, nan, nan], 'TBD.2': [nan, nan, nan], 'TimeStamp': ['2016/06/08 17:19:53', '2016/06/08 17:19:54', '2016/06/08 17:19:54'], 'Value': [0.062941999999999998, 0.062941999999999998, 0.062941999999999998]})
df['TimeStamp'] = pd.to_datetime(df['TimeStamp']).dt.strftime('%m/%d/%Y %H:%M:%S')
print(df)

yields

   TBD  TBD.1  TBD.2            TimeStamp     Value
0  NaN    NaN    NaN  06/08/2016 17:19:53  0.062942
1  NaN    NaN    NaN  06/08/2016 17:19:54  0.062942
2  NaN    NaN    NaN  06/08/2016 17:19:54  0.062942
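Note that dt.strftime returns strings again, so the reformatted column can no longer be used for date arithmetic. A minimal sketch, assuming you want both forms: keep the parsed column and store the formatted text separately. The column name TimeStampText is made up for this illustration.

```python
import pandas as pd

df = pd.DataFrame({'TimeStamp': ['2016/06/08 17:19:53', '2016/06/08 17:19:54']})

# Parse once and keep the datetime64 values for sorting, filtering and arithmetic.
parsed = pd.to_datetime(df['TimeStamp'], format='%Y/%m/%d %H:%M:%S')

# Format only for display or export; this column holds plain strings.
# 'TimeStampText' is a hypothetical column name.
df['TimeStampText'] = parsed.dt.strftime('%m/%d/%Y %H:%M:%S')
df['TimeStamp'] = parsed

print(df.dtypes)
print(df)
```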

Since you want to convert a column of strings to another (different) column of strings, you could also use the vectorized str.replace method:

import numpy as np
import pandas as pd
nan = np.nan
df = pd.DataFrame({'TBD': [nan, nan, nan], 'TBD.1': [nan, nan, nan], 'TBD.2': [nan, nan, nan], 'TimeStamp': ['2016/06/08 17:19:53', '2016/06/08 17:19:54', '2016/06/08 17:19:54'], 'Value': [0.062941999999999998, 0.062941999999999998, 0.062941999999999998]})
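# Capture year, month, day and the rest, then emit month/day/year + rest.
# regex=True is required on recent pandas versions, where str.replace is literal by default.
df['TimeStamp'] = df['TimeStamp'].str.replace(r'(\d+)/(\d+)/(\d+)(.*)', r'\2/\3/\1\4', regex=True)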
print(df)

since

In [32]: df['TimeStamp'].str.replace(r'(\d+)/(\d+)/(\d+)(.*)', r'\2/\3/\1\4', regex=True)
Out[32]: 
0    06/08/2016 17:19:53
1    06/08/2016 17:19:54
2    06/08/2016 17:19:54
Name: TimeStamp, dtype: object

This uses regex to rearrange pieces of the string without first parsing the string as a date. This is faster than the first method (mainly because it skips the parsing step), but it also has the disadvantage of not checking that the date strings are valid dates.
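To make the validation trade-off concrete, here is a small sketch comparing the two approaches on a deliberately invalid value. The second string (month 13, day 45) is invented for illustration and is not from the question's data.

```python
import pandas as pd

# The second value is not a real date; it is made up for this example.
s = pd.Series(['2016/06/08 17:19:53', '2016/13/45 17:19:54'])

# Regex rearrangement: the pieces are shuffled with no check that they form a date.
swapped = s.str.replace(r'(\d+)/(\d+)/(\d+)(.*)', r'\2/\3/\1\4', regex=True)

# Parsing: errors='coerce' turns unparseable values into NaT so they can be located.
parsed = pd.to_datetime(s, format='%Y/%m/%d %H:%M:%S', errors='coerce')

print(swapped.tolist())        # the bad value is rearranged like any other
print(parsed.isna().tolist())  # True marks rows that failed to parse
```

If the input is known to be clean, the regex route is fine; if it comes from an untrusted CSV, parsing first is the safer choice.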

Answer 2

Answered by Sarah

For most common date and datetime formats, the pandas.to_datetime function can parse them without an explicit format. For example:

df.TimeStamp.apply(lambda x: pd.to_datetime(x))

And for the example given in the question,

df['TimeStamp'] = pd.to_datetime(df['TimeStamp']).dt.strftime('%m/%d/%Y %H:%M:%S')

will give us the same result.

Using .apply can be convenient if you have multiple columns to convert, although calling pd.to_datetime on a whole column is generally faster than applying it element by element.
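A minimal sketch of the multi-column case, assuming two date columns with the same layout as in the question. The column names Start and End are hypothetical.

```python
import pandas as pd

# 'Start' and 'End' are hypothetical column names used only for this sketch.
df = pd.DataFrame({
    'Start': ['2016/06/08 17:19:53', '2016/06/08 17:19:54'],
    'End': ['2016/06/09 08:00:00', '2016/06/09 08:00:01'],
})

date_cols = ['Start', 'End']

# DataFrame.apply hands each column (a Series) to pd.to_datetime in turn,
# so every column is still parsed as a whole rather than value by value.
df[date_cols] = df[date_cols].apply(pd.to_datetime, format='%Y/%m/%d %H:%M:%S')

# Reformat each parsed column back to strings in the target layout.
formatted = df[date_cols].apply(lambda col: col.dt.strftime('%m/%d/%Y %H:%M:%S'))
print(formatted)
```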

Of course, providing the parsing format is necessary in many situations. For a full list of format codes, please see https://docs.python.org/3/library/datetime.html.
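One situation where an explicit format helps is ambiguous day/month order. A short sketch, assuming day-first input strings (these values are invented, not taken from the question):

```python
import pandas as pd

# Hypothetical day-first strings; '08/06/2016' could be read as Aug 6 or Jun 8.
s = pd.Series(['08/06/2016 17:19:53', '09/06/2016 17:19:54'])

# Spell out the layout so 08 is read as the day and 06 as the month.
parsed = pd.to_datetime(s, format='%d/%m/%Y %H:%M:%S')

print(parsed.dt.month.tolist())
print(parsed.dt.strftime('%m/%d/%Y %H:%M:%S').tolist())
```

Passing format also means values that do not match the layout raise an error instead of being guessed at.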
