Disclaimer: this page is a parallel translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). Original StackOverflow URL: http://stackoverflow.com/questions/46069234/

Time: 2020-09-14 04:24:11  Source: igfitidea

Python-Pandas-Dataframe-datetime conversion excluding null value cells

python pandas datetime dataframe notnull

Asked by Mike H

Thanks for taking the time to look at my question.

I am trying to convert two date columns in a pandas dataframe using the function below. I use this function because the "Closed Date" column has only 4221 non-null values out of 4272 rows, so the conversion should not crash on the null cells.

Ultimately, the change should result in a dataframe with the original number of rows, so I don't want to lose the rows that have null values in "Closed Date".

Dataframe overview:

<class 'pandas.core.frame.DataFrame'>
Int64Index: 4272 entries, 0 to 4271
Data columns (total 4 columns):
Created Date    4272 non-null object
Closed Date     4221 non-null object
Agency          4272 non-null object
Borough         4272 non-null object
dtypes: object(4)

Designed function:

col='Closed Date'
df[(df[col].notnull())] = df[(df[col].notnull())].apply(lambda    x:datetime.datetime.strptime(x,'%m/%d/%Y %I:%M:%S %p'))

Generated error:

TypeError                                 Traceback (most recent call  last)
<ipython-input-155-49014bb3ecb3> in <module>()
      9 
     10 col='Closed Date'
---> 11 df[(df[col].notnull())] = df[(df[col].notnull())].apply(lambda     x:datetime.datetime.strptime(x,'%m/%d/%Y %I:%M:%S %p'))
     12 print(type(df[(df[col].notnull())]))

/anaconda/lib/python3.6/site-packages/pandas/core/frame.py in     apply(self, func, axis, broadcast, raw, reduce, args, **kwds)
   4358                         f, axis,
   4359                         reduce=reduce,
-> 4360                         ignore_failures=ignore_failures)
   4361             else:
   4362                 return self._apply_broadcast(f, axis)

/anaconda/lib/python3.6/site-packages/pandas/core/frame.py in     _apply_standard(self, func, axis, ignore_failures, reduce)
   4454             try:
   4455                 for i, v in enumerate(series_gen):
-> 4456                     results[i] = func(v)
   4457                     keys.append(v.name)
   4458             except Exception as e:

<ipython-input-155-49014bb3ecb3> in <lambda>(x)
      9 
     10 col='Closed Date'
---> 11 df[(df[col].notnull())] = df[(df[col].notnull())].apply(lambda     x:datetime.datetime.strptime(x,'%m/%d/%Y %I:%M:%S %p'))
     12 print(type(df[(df[col].notnull())]))

TypeError: ('strptime() argument 1 must be str, not Series', 'occurred     at index Created Date')

Accepted answer by jezrael

I think you only need to_datetime. It converts NaN to NaT, so all values in the column are datetimes:

col='Closed Date'
df[col] = pd.to_datetime(df[col], format='%m/%d/%Y %I:%M:%S %p')
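The question mentions two date columns, and the same call can be repeated for each of them. The following is only a minimal sketch on a small made-up frame: the values are hypothetical, and it assumes that "Created Date" uses the same format string as "Closed Date", which the question does not confirm.

```python
import numpy as np
import pandas as pd

# Hypothetical miniature of the question's frame (values are made up).
df = pd.DataFrame({
    'Created Date': ['04/30/2016 11:59:59 PM',
                     '05/01/2016 12:00:01 AM',
                     '05/01/2016 01:15:00 PM'],
    'Closed Date': ['05/01/2016 05:10:10 AM',
                    np.nan,
                    '05/02/2016 12:30:00 PM'],
})

# Assumption: both columns share this format.
fmt = '%m/%d/%Y %I:%M:%S %p'

for col in ['Created Date', 'Closed Date']:
    # Missing values become NaT; no rows are dropped.
    df[col] = pd.to_datetime(df[col], format=fmt)

print(df.dtypes)
print(len(df), df['Closed Date'].isna().sum())
```

The row count stays the same after the conversion, and the missing "Closed Date" entry is still missing, only now as NaT.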

Sample:

import numpy as np
import pandas as pd

df = pd.DataFrame({'Closed Date':['05/01/2016 05:10:10 AM',
                                  '05/01/2016 05:10:10 AM',
                                  np.nan]})

col='Closed Date'
df[col] = pd.to_datetime(df[col], format='%m/%d/%Y %I:%M:%S %p')
print(df)
          Closed Date
0 2016-05-01 05:10:10
1 2016-05-01 05:10:10
2                 NaT

print(df.dtypes)
Closed Date    datetime64[ns]
dtype: object
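As a side note on the TypeError in the question: the message says strptime received a Series, which is consistent with DataFrame.apply (with its default axis) handing each whole column to the function rather than each cell. The sketch below, on a made-up two-row frame, shows that behaviour and then an element-wise variant restricted to a single column. The variable names are illustrative only, and to_datetime as shown above remains the simpler route.

```python
import datetime

import numpy as np
import pandas as pd

# Hypothetical two-row frame (values are made up).
df = pd.DataFrame({
    'Created Date': ['05/01/2016 04:00:00 AM', '05/01/2016 04:30:00 AM'],
    'Closed Date': ['05/01/2016 05:10:10 AM', np.nan],
})

# DataFrame.apply with the default axis passes each column as a Series.
seen = df.apply(lambda x: type(x).__name__)
print(seen)

col = 'Closed Date'
fmt = '%m/%d/%Y %I:%M:%S %p'

# Series.apply works cell by cell, so strptime receives plain strings.
parsed = df[col].dropna().apply(lambda s: datetime.datetime.strptime(s, fmt))

# Reindexing to the original index restores the rows that were null.
result = parsed.reindex(df.index)
print(result)
```

Selecting the single column first is the key difference from the original attempt, which applied the function to the whole filtered dataframe.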