Using lambda and strftime on dates when there are null values (Pandas)

Notice: this page is derived from a popular StackOverflow question and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). Original StackOverflow URL: http://stackoverflow.com/questions/35488036/

Time: 2020-09-14 00:43:26  Source: igfitidea

Tags: python, pandas, lambda, strftime

Asked by FortuneFaded

I'm trying to change the format of a datetime column in my DataFrame using lambda and strftime, like this:

df['Date Column'] = df['Date Column'].map(lambda x: x.strftime('%m/%d/%Y'))

However, some of these fields contain null values, so this raises an error. I cannot drop the null rows because I still need them for the data in the other columns. Is there a way around this error without dropping the nulls?

Perhaps something like

df['Date Column'].map(lambda x: x.strftime('%m/%d/%Y') if x != null else "")

?

The method I've used is to drop the nulls, format the column, then merge it back onto the original dataset, but this seems like a very inefficient method.

Accepted answer by Stop harming Monica

You should not check for NaN/NaT (in)equality, but pd.notnull() should work, and it does for me:

import pandas as pd

s = pd.date_range('2000-01-01', periods=5).to_series().reset_index(drop=True)
s[2] = None
s

0   2000-01-01
1   2000-01-02
2          NaT
3   2000-01-04
4   2000-01-05
dtype: datetime64[ns]

s.map(lambda x: x.strftime('%m/%d/%Y') if pd.notnull(x) else '')

0    01/01/2000
1    01/02/2000
2              
3    01/04/2000
4    01/05/2000
dtype: object

This returns the same result as the answers by @Alexander and @Batman but is more explicit. It may also be slightly slower for large series.
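
As a small illustration of why equality checks are unreliable here, the sketch below compares missing-value markers with themselves and then with pd.notnull. The variable names are made up for this example and are not part of the original answer.

```python
import numpy as np
import pandas as pd

# Missing values never compare equal, not even to themselves,
# so a test such as `x != pd.NaT` cannot single them out.
nat_equals_itself = (pd.NaT == pd.NaT)
nan_equals_itself = (np.nan == np.nan)

# pd.notnull (an alias of pd.notna) recognises each missing-value marker.
markers = [pd.NaT, np.nan, None]
detected = [pd.notnull(m) for m in markers]

# A real timestamp is reported as not null.
valid = pd.notnull(pd.Timestamp('2000-01-01'))

print(nat_equals_itself, nan_equals_itself, detected, valid)
```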

Alternatively, you can use the .dt accessor. The null values will be formatted as NaT.

s.dt.strftime('%m/%d/%Y')

0    01/01/2000
1    01/02/2000
2           NaT
3    01/04/2000
4    01/05/2000
dtype: object
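
If empty strings are wanted instead of a missing-value marker, one possible follow-up is to mask the formatted result with the null positions of the original series. This is only a sketch, not part of the original answer. It assumes the same series s as above, and it uses Series.where so that it does not depend on how a given pandas version represents the formatted nulls.

```python
import pandas as pd

s = pd.date_range('2000-01-01', periods=5).to_series().reset_index(drop=True)
s[2] = None  # becomes NaT in a datetime64 series

# Format every row with the vectorised accessor, then keep the formatted
# value only where the original date is present; use '' elsewhere.
formatted = s.dt.strftime('%m/%d/%Y').where(s.notna(), '')

print(formatted)
```

This keeps the vectorised formatting and avoids calling a Python lambda on each row.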

Answer by Batman

Personally I'd just define a small function, and then use that.

def to_string(date):
    if date:
        string = date.strftime('%Y%m%d')
    else:
        string = ""

    return string

Then

df['Date Column'].map(to_string) 

Otherwise

df['Date Column'].map(lambda x: x.strftime('%Y%m%d') if x else "")
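
A truthiness test depends on how the missing value is represented: None is falsy, but a float NaN is truthy (bool(float('nan')) is True), so a NaN would still reach strftime and raise an AttributeError. A more defensive variant of the same helper, sketched here with pd.notna, might look like the following. The fmt parameter and the sample data are assumptions added for illustration.

```python
import numpy as np
import pandas as pd

def to_string(date, fmt='%Y%m%d'):
    """Format a date, returning '' for None, NaN, or NaT."""
    if pd.notna(date):
        return date.strftime(fmt)
    return ''

# Hypothetical sample data: the None is stored as NaT in the series.
dates = pd.Series([pd.Timestamp('2000-01-01'), None, pd.Timestamp('2000-01-03')])
result = dates.map(to_string)

print(result)
```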

Answer by Alexander

You can use a conditional expression (ternary).

df['Date Column'] = df['Date Column'].map(lambda x: x.strftime('%m/%d/%Y') if x else '')