Original source: http://stackoverflow.com/questions/55145108/
Notice: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not to the maintainer of this page): StackOverFlow
Create new date column in python pandas
Asked by caddie
I'm trying to create a new date column based on an existing date column in my dataframe. I want to take each date in the first column and set it to the first of its month in the second column, so:
03/15/2019 = 03/01/2019
I know I can do this using:
df['newcolumn'] = pd.to_datetime(df['oldcolumn'], format='%Y-%m-%d').apply(lambda dt: dt.replace(day=1)).dt.date
My issue is that some of the values in the old column are not valid dates; some rows contain text. So I'm trying to figure out how to either clean up the data before I do this, like:
if oldcolumn isn't a date then make it 01/01/1990 else oldcolumn
Or, is there a way to do this with try/except?
Any assistance would be appreciated.
Answered by JoergVanAken
First we generate some sample data:
df = pd.DataFrame([['2019-01-03'], ['asdf'], ['2019-11-10']], columns=['Date'])
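This can be safely converted to datetime; values that cannot be parsed become NaT and are then replaced with the default date:
# Unparseable values such as 'asdf' become NaT instead of raising
df['Date'] = pd.to_datetime(df['Date'], errors='coerce')
mask = df['Date'].isnull()
# Replace the NaT rows with the default date
df.loc[mask, 'Date'] = pd.Timestamp(1990, 1, 1)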
Now you don't need the slow apply:
df['New'] = df['Date'] + pd.offsets.MonthBegin(-1)
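One caveat with the offset arithmetic above: adding pd.offsets.MonthBegin(-1) to a date that already falls on the first of a month moves it back a whole month, so the 1990-01-01 placeholder would end up as 1989-12-01. The following is a minimal sketch of one possible alternative, assuming a round trip through monthly periods is acceptable; the 'Offset' column is only there for comparison and is not part of the original answer.

```python
import pandas as pd

df = pd.DataFrame([['2019-01-03'], ['asdf'], ['2019-11-10']], columns=['Date'])
df['Date'] = pd.to_datetime(df['Date'], errors='coerce')
df['Date'] = df['Date'].fillna(pd.Timestamp(1990, 1, 1))

# Offset arithmetic: a date already on the 1st rolls back to the previous month.
df['Offset'] = df['Date'] + pd.offsets.MonthBegin(-1)

# Alternative: convert to a monthly period, then back to that period's start.
df['New'] = df['Date'].dt.to_period('M').dt.to_timestamp()

print(df)
```

With this variant the placeholder row stays at 1990-01-01, while mid-month dates still move to the first of their own month.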
Answered by Erfan
Try the argument errors='coerce'. This will return NaT for the text values.
df['newcolumn'] = pd.to_datetime(df['oldcolumn'],
                                 format='%Y-%m-%d',
                                 errors='coerce').apply(lambda dt: dt.replace(day=1)).dt.date
For example:
# We have this dataframe
    ID        Date
0  111  03/15/2019
1  133  01/01/2019
2  948       Empty
3  452  02/10/2019
# We convert Date column to datetime
df['Date'] = pd.to_datetime(df.Date, format='%m/%d/%Y', errors='coerce')
Output:
    ID       Date
0  111 2019-03-15
1  133 2019-01-01
2  948        NaT
3  452 2019-02-10
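To carry this example through to what the question asks for, the NaT row can be given the 01/01/1990 placeholder and every date moved to the first of its month. This is a minimal sketch under the assumption that the data look like the four-row sample above; the column name 'FirstOfMonth' is made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    'ID': [111, 133, 948, 452],
    'Date': ['03/15/2019', '01/01/2019', 'Empty', '02/10/2019'],
})

# Unparseable text such as 'Empty' becomes NaT instead of raising.
df['Date'] = pd.to_datetime(df['Date'], format='%m/%d/%Y', errors='coerce')

# Replace NaT with the placeholder date from the question.
df['Date'] = df['Date'].fillna(pd.Timestamp(1990, 1, 1))

# Subtract (day - 1) days so each date lands on the 1st of its own month.
df['FirstOfMonth'] = df['Date'] - pd.to_timedelta(df['Date'].dt.day - 1, unit='D')

print(df)
```

If plain date objects are wanted rather than timestamps, .dt.date can be applied to the result, as in the question's own one-liner.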